SafeKit 7 & 8 – Knowledge Base
Known Problems, Restrictions or Changes
Id : SK-0001
OS / Release : Linux / All (for SafeKit 7.0.8, see SK-0021)
Problem : File replication doesn't work if there is a mount point under the replicated directory (error “JUKEBOX”)
Mars Id : 22041
Id : SK-0002
OS / Release : Windows / All
Problem : With SQL Server 2005, SafeKit sometimes stops on the primary if “Boost SQL server priority” is used (the sqlserver process uses 100% CPU and SafeKit stops with the error “IOS - ReleaseINK kernel->user error”)
Solution : Disable “Boost SQL server priority” (SQL Management Studio => select your server => Properties => Processors)
Mars Id : 21956
Id : SK-0003
OS / Release : Windows 2003 64-bit kernel / 7.0.4
Problem : The “safekit kill” command doesn't work with the “exit” or “exception” option
Solution : Use the “safekit kill” command with the “terminate” option
Mars Id : 20278
Id : SK-0005
OS / Release : Linux / All
Problem : “safekit forcestop” doesn't complete on “nfsbox” death
Solution : Reboot your system
Mars Id : 19565
Id : SK-0006
OS / Release : Linux / All
Problem : When Oracle 10.2 is started by SafeKit, the database startup fails with the error “ORA-00205: error in identifying control file, check alert log for more info”.
Solution : Set <rfs packetsize="32768"> in the userconfig.xml file and use SafeKit >= 7.0.1.15
Mars Id : 21552
Id : SK-0007
OS / Release : All / All
Problem : “quick configure” doesn't work for an application module built with SafeKit 6.2
Solution : Save the file “<SAFE>/modules/AM/web/htmllib.lua” (AM is your application module) and replace it with the “web/htmllib.lua” file from a SafeKit 7.0 application module. If “quick configure” still doesn't work, update the file “<SAFE>/modules/AM/web/index.lua” according to the functions defined in “web/htmllib.lua”.
Mars Id : 22025
Id : SK-0008
OS / Release : All / 7.0.1
Problem : After a migration from SafeKit 6.2 to SafeKit 7.0 (< 7.0.1.21), SafeMonitor fails to add a new server to administer and returns the "received error: forbidden" message. The problem is that SafeMonitor 7.0 tries to work with an “httpd.conf” file that comes from the SafeKit 6.2 installation and is not compatible.
Solution : Replace the “httpd.conf” file (under <SAFE>/web/conf) with the “httpd.conf.default” file.
Other solution : Uninstall SafeKit 7.0, remove the <SAFE>/web directory and re-install SafeKit 7.0.
Fix : Fixed in SafeKit >= 7.0.1.21
Id : SK-0009
OS / Release : Windows / All
Problem : File attributes replication: file encryption and file compression are not supported
Mars Id : 20912-20913
Id : SK-0010
OS / Release : Linux / from 7.0.4
Restriction : A replicated directory cannot be the root of a file system when mountover="off" (mandatory on Linux). See SK-0030 for a workaround.
Id : SK-0012
OS / Release : Linux / from 7.0.1
Restriction : The NFS server on RedHat 4 Update 3 does not support ACL. Thus the acl attribute of a replicated directory cannot be set to “on”.
Id : SK-0013
OS / Release : Linux / All
Problem : The interface checker doesn't work with bonding interfaces.
Id : SK-0014
OS / Release : All / from 7.0.4
Restriction : Failover of NFS mounts of replicated directories from remote NFS clients is no longer supported
Id : SK-0015
OS / Release : All / from 7.0.4
Changes : Since SafeKit 7.0.4.13, new attributes for the rfs configuration
Configuration sample : <rfs checktime="30000" reitimeout="50" async="second" mountover="off" packetsize="16384" maxnbretrans="50" reicommit="1000">
Id : SK-0017
OS / Release : All / All
Changes and Restriction : SafeKit start blocks in the wait state when a heartbeat with ident="flow" is configured while there is no replication configuration (<rfs> section).
Solution : Fixed in 7.5.0.11 for Linux and 7.5.0.12 for Windows. For previous releases, remove the ident attribute.
Id : SK-0018
OS / Release : Linux / from 7.0.8
Problem : Red Hat > 4 freezes with file replication under heavy write load.
In that case, the system hangs but the other server of the cluster does not detect the error since network communication is still working. You then have to reboot the broken server.
Solution : The kernel freeze is a Linux bug.
You can try to change the kernel parameters as follows:
Insert into the file /etc/sysctl.conf:
vm.dirty_ratio=5
vm.dirty_background_ratio=5
Run sysctl -p
Our tests show that these settings help to solve the problem in many cases.
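As a quick illustration, the check below reports which of the two recommended parameters are missing from a sysctl.conf-style text. It is only a sketch: the helper names (`parse_sysctl`, `missing_settings`) are hypothetical, and it works on a string so that it can be tried without touching /etc/sysctl.conf.

```python
# Recommended values taken from the entry above.
RECOMMENDED = {"vm.dirty_ratio": "5", "vm.dirty_background_ratio": "5"}


def parse_sysctl(text):
    """Parse sysctl.conf-style text into a dict; the last assignment of a key wins."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments and anything that is not "key = value".
        if not line or line[0] in "#;" or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_settings(text, wanted=RECOMMENDED):
    """Return the recommended settings that are absent or set to another value."""
    current = parse_sysctl(text)
    return {key: value for key, value in wanted.items() if current.get(key) != value}


# Hypothetical file content with only one of the two parameters set.
sample = "# example content\nvm.dirty_ratio = 5\n"
print(missing_settings(sample))
```

To check a real server, the content of /etc/sysctl.conf can be read and passed to `missing_settings`.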
Id : SK-0019
OS / Release : Windows / from 7.0.8.7
Changes and Restriction : The SafeKit SNMP agent (safeagent service) does not work.
Mars Id : 27387
Solution : Use SafeKit >= 7.0.8.25
Id : SK-0020
OS / Release : Windows / from 7.0.8.17
Changes : Since SafeKit 7.0.8.17, the new attribute “roflags” of the rfs configuration is used to configure the behavior of file replication when a process is accessing a replicated directory on the secondary.
Values :
Id : SK-0021
OS / Release : Linux / 7.0.8
Problem : When a replicated directory (e.g. “/Tests/Repli”) contains a mounted file system (e.g. “/Tests/Repli/MyFileSystem”), re-integration fails with the “JUKEBOX” error.
Solution : Use SafeKit >= 7.0.8.26 and apply these changes:
- Set exportopt="crossmnt" (<rfs> configuration)
- If “/Tests/Repli” is your replicated directory and “MyFileSystem” your file system (under “/Tests/Repli”), use this command to export the file system:
exportfs -i -o rw,wdelay,insecure,no_root_squash,no_subtree_check localhost.localdomain:/Tests/Repli_For_SafeKit_Replication/MyFileSystem
Id : SK-0022
OS / Release : Linux / All
Problem : Correctly exported NFS mounts sometimes fail to mount with a "Permission denied" error. This error prevents a SafeKit module using file replication (with <rfs>) from starting. It fails with the following error in the SafeKit log:
| 2009-06-09 08:35:23:080185 | nfsboxv3 | W | Mount error: 13.
Mars Id : 32204
Solution : This is a known Linux bug, reported in RedHat Bugzilla - Bug 452415. The reason is that nfsd is not mounted on /proc/fs/nfsd while the nfs service is running. Check it by running the mount command, which lists all current mounts. If the line "nfsd on /proc/fs/nfsd type nfsd (rw)" is not listed, your system is broken for NFS. Adding this mount manually (by running the command /bin/mount -t nfsd nfsd /proc/fs/nfsd) produces the correct result, and NFS mounts, and thus SafeKit, become available. If you encounter this problem on a Linux SafeKit server, the workaround is to insert the following lines into the SafeKit prestart user script:
# Field 1 of the mount output is the device, field 5 the file system type
is_mounted=`/bin/mount | /usr/bin/awk '$1 ~ /^nfsd$/ { print $5 }'`
if [ -z "$is_mounted" ] ; then
    /bin/mount -t nfsd nfsd /proc/fs/nfsd
fi
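The same test can be expressed on captured `mount` output, for instance in a monitoring tool. This is a minimal sketch, assuming the usual Linux line shape "<device> on <mountpoint> type <fstype> (<options>)"; the function name `nfsd_is_mounted` is hypothetical.

```python
def nfsd_is_mounted(mount_output):
    """Return True if the text contains the line 'nfsd on /proc/fs/nfsd type nfsd (...)'."""
    for line in mount_output.splitlines():
        fields = line.split()
        # Expected fields: device, "on", mount point, "type", file system type, options.
        if (len(fields) >= 5 and fields[0] == "nfsd" and fields[1] == "on"
                and fields[2] == "/proc/fs/nfsd" and fields[4] == "nfsd"):
            return True
    return False


# Hypothetical mount output of a healthy server.
healthy = "/dev/sda1 on / type ext3 (rw)\nnfsd on /proc/fs/nfsd type nfsd (rw)\n"
print(nfsd_is_mounted(healthy))
```

When the function returns False, the manual mount command given in the solution applies.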
Id : SK-0023
OS / Release : All / All
Problem : How to temporarily disconnect file mirroring from one SafeKit module?
Solution : Stop the SafeKit module and edit its configuration (userconfig.xml) in order to :
<heart>
<heartbeat>
<server addr="192.168.1.16"/>
<server addr="192.168.1.20"/>
</heartbeat>
<heartbeat ident="flow">
<server addr="10.0.0.1"/>
<server addr="10.0.0.2"/>
</heartbeat>
</heart>
Disable file replication configuration
For this, comment out the whole <rfs> tag section by inserting <!-- (to begin the comment) and --> (to end the comment) as shown below (warning: comments may not be nested) :
<!--
<rfs>
<flow>
<server addr="10.0.0.1"/>
<server addr="10.0.0.2"/>
</flow>
<replicated dir="/safedir" mode="read_only"/>
</rfs>
-->
Then the new module configuration can be applied and the module started.
WARNING: when file mirroring is disabled, only one server must be running, in the alone state. The other server must not be started since it could run a failover with data that is not up to date. You can uninstall the module on that server to ensure that it cannot start (and reinstall it later).
Id : SK-0024
OS / Release : All / from 7.0.8.25
Changes : Since SafeKit 7.0.8.25, new degraded mode for the rfs component
When nfsbox, the main rfs component, encounters a severe error, it now goes into degraded mode on the primary server instead of stopping. The secondary server, if any, then runs a stopstart and blocks until the other server comes back into default mode. This improves operational continuity since there is no restart or failover of the application. But in degraded mode, file mirroring and high availability are no longer provided. The alone degraded server must be restarted as primary to come back into default mode. This is a manual operation that must be run by the administrator (stop-prim or stopstart via SafeMonitor or the safekit command) when stopping the application is known not to be critical. The other server will then run data synchronization and become secondary.
You can read the server state to get its mode (state via SafeMonitor or the safekit command). For instance, the following shows the state of a server in degraded mode (ALONE state and up value for resource rfs.degraded):
--------------------- mirror State ---------------------
Local (127.0.0.1) : ALONE (Service : Available)
Resources
Name               State  Since
heartbeat.0        up     2009-07-23 08:22:32
heartbeat.flow     up     2009-07-23 08:22:32
rfs.uptodate       up     2009-07-23 08:22:37
rfs.lastprimstate  down   2009-07-23 08:22:37
rfs.swapping       down   2009-07-23 08:22:32
rfs.degraded       up     2009-07-23
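A script can detect degraded mode from such a state listing. The sketch below assumes that each resource line has the shape "<name> <up|down> <date> <time>", as in the listing above; the helper names are hypothetical and the sample text is abridged from that listing.

```python
def parse_resources(state_text):
    """Return {resource_name: state} for the lines shaped '<name> <up|down> ...'."""
    resources = {}
    for line in state_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] in ("up", "down"):
            resources[fields[0]] = fields[1]
    return resources


def is_degraded(state_text):
    """A server is in degraded mode when the resource rfs.degraded is up."""
    return parse_resources(state_text).get("rfs.degraded") == "up"


# Abridged copy of the state listing shown in this entry.
sample_state = """Local (127.0.0.1) : ALONE (Service : Available)
Resources
Name State Since
heartbeat.0 up 2009-07-23 08:22:32
rfs.uptodate up 2009-07-23 08:22:37
rfs.swapping down 2009-07-23 08:22:32
rfs.degraded up 2009-07-23
"""
print(is_degraded(sample_state))
```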
Id : SK-0025
OS / Release : All / from 7.0.8
Restriction : Renaming a directory between replicated and not replicated trees is not supported
This restriction applies only when you configure not replicated directories in the <rfs> tag. For instance:
<rfs>
  <replicated dir="/repdir" mode="read_only">
    <notreplicated path="notrepdir" />
  </replicated>
</rfs>
Renaming files between replicated and not replicated trees is supported. For instance, the operations below are allowed:
mv /repdir/file /repdir/notrepdir
mv /repdir/notrepdir/file /repdir
But renaming directories between replicated and not replicated trees may lead to a secondary stop-start and/or to degraded mode (cf. Mars 34165, 63859 and 63864). For instance, the operations below are not supported:
mv /repdir/dir /repdir/notrepdir
mv /repdir/notrepdir/dir /repdir
Id : SK-0026
OS / Release : All / Since 7.0.9.17
Change : Add user scripts argument
This argument can be used, for instance, to send an e-mail on module start and stop.
Transiting from STOP to WAIT
During this transition, the scripts transition and prestart are called in the following manner:
transition STOP WAIT [ start | stopstart | stopwait ]
prestart STOP WAIT [ start | stopstart ]
The STOP and WAIT arguments are the current and next states.
The start argument is set on module start (with safekit start | prim | second).
The stopstart argument is set on module stop-start (with safekit stopstart, called either by the user or by a checker).
The stopwait argument is set on module stop-start for waiting for a resource (wait rules of the failover machine). But only the transition user script is called in that case.
Transiting from WAIT to STOP
During this transition, the scripts poststop and transition are called in the following manner:
poststop WAIT STOP [ stop | stopstart ]
transition WAIT STOP [ stop | stopstart | stopwait ]
The WAIT and STOP arguments are the current and next states.
The stop argument is set on module stop, that is, a stop that is not followed by an automatic start.
The stopstart argument is set on module stop-start (with safekit stopstart, called either by the user or by a checker).
The stopwait argument is set on module stop-start for waiting for a resource (wait rules of the failover machine). But only the transition user script is called in that case.
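The calling conventions above can be summarised as a small lookup table. The sketch below is illustrative only: the function names and the default module name "AM" are assumptions, and it merely builds the subject of a notification mail from the script name and its three arguments.

```python
# Allowed third argument for each (script, current state, next state) call listed above.
VALID_ACTIONS = {
    ("transition", "STOP", "WAIT"): {"start", "stopstart", "stopwait"},
    ("prestart", "STOP", "WAIT"): {"start", "stopstart"},
    ("poststop", "WAIT", "STOP"): {"stop", "stopstart"},
    ("transition", "WAIT", "STOP"): {"stop", "stopstart", "stopwait"},
}


def is_valid_call(script, current, next_state, action):
    """Return True if the script can be called with these arguments."""
    return action in VALID_ACTIONS.get((script, current, next_state), set())


def mail_subject(script, current, next_state, action, module="AM"):
    """Build a mail subject for a valid call, or return None for an unexpected one."""
    if not is_valid_call(script, current, next_state, action):
        return None
    return f"Module {module}: {script} {current} -> {next_state} ({action})"


print(mail_subject("poststop", "WAIT", "STOP", "stop"))
```

For example, prestart is never called with stopwait, so `is_valid_call("prestart", "STOP", "WAIT", "stopwait")` is False.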
Id : SK-0029
OS / Release : SUSE SLES 11 / All
Problem : Modules in farm mode are unable to start because the SafeKit vip kernel module is not allowed to load
Solution : You have to allow the loading of the vip kernel module. For this, set allow_unsupported_modules to 1 in /etc/modprobe.d/unsupported-modules
Id : SK-0030
OS / Release : Linux / From 7.0.9
Problem : The module configuration fails when a replicated directory is a mount point
Solution : Apply the following manual procedure as a workaround.
This article takes the example of a PostgreSQL module that sets as replicated directories /var/lib/pgsql/var and /var/lib/pgsql/data, which are mount points.
The SafeKit module configuration fails with the error:
Error : Device or resource busy
The procedure is the same for all mount points that must be replicated.
Detect mount points with a command line
On both nodes, check mount points with the command df -H, which returns for instance:
df -H
/dev/mapper/vg01-lv_pgs_var … /var/lib/pgsql/var
/dev/mapper/vg02-lv_pgs_data … /var/lib/pgsql/data
/var/lib/pgsql/var and /var/lib/pgsql/data are mount points and they must be replicated for PostgreSQL.
But the SafeKit module configuration command /opt/safekit/safekit config -m postgresql returns:
Error : Device or resource busy
What to do if a replicated directory is a mount point
Edit /opt/safekit/modules/postgresql/userconfig.xml:
<rfs … >
<replicated dir="/var/lib/pgsql/var" mode="read_only" />
<replicated dir="/var/lib/pgsql/data" mode="read_only" />
</rfs>
Unmount /var/lib/pgsql/var and /var/lib/pgsql/data
Run /opt/safekit/safekit config -m postgresql, which should succeed (no errors)
Check the result with ls -l /var/lib:
ls -l /var/lib
lrwxrwxrwx 1 root root var -> var_For_SafeKit_Replication
lrwxrwxrwx 1 root root data -> data_For_SafeKit_Replication
Edit /etc/fstab and change the two lines:
/dev/mapper/vg01-lv_pgs_var /var/lib/pgsql/var ext4…
/dev/mapper/vg02-lv_pgs_data /var/lib/pgsql/data ext4…
with
/dev/mapper/vg01-lv_pgs_var /var/lib/pgsql/var_For_SafeKit_Replication ext4…
/dev/mapper/vg02-lv_pgs_data /var/lib/pgsql/data_For_SafeKit_Replication ext4…
Run mount /var/lib/pgsql/var_For_SafeKit_Replication and mount /var/lib/pgsql/data_For_SafeKit_Replication
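The /etc/fstab change can be illustrated with a small text transformation. This is a sketch under assumptions: the function name `retarget_fstab` is hypothetical, and the fields after the file system type in the sample ("defaults 0 0") are placeholders because the entry elides them. It works on a string and does not write any file.

```python
import re

SUFFIX = "_For_SafeKit_Replication"
# Captures: device plus separator, mount point, and the rest of the line.
FSTAB_LINE = re.compile(r"^(\s*\S+\s+)(\S+)(.*)$")


def retarget_fstab(fstab_text, replicated_dirs):
    """Append SUFFIX to the mount point of every fstab line that mounts a replicated directory."""
    result = []
    for line in fstab_text.splitlines():
        match = FSTAB_LINE.match(line)
        is_comment = line.lstrip().startswith("#")
        if match and not is_comment and match.group(2) in replicated_dirs:
            line = match.group(1) + match.group(2) + SUFFIX + match.group(3)
        result.append(line)
    return "\n".join(result)


# Placeholder fstab content; only the first two fields come from the entry.
sample_fstab = (
    "/dev/mapper/vg01-lv_pgs_var /var/lib/pgsql/var ext4 defaults 0 0\n"
    "/dev/sda1 / ext4 defaults 0 1"
)
print(retarget_fstab(sample_fstab, {"/var/lib/pgsql/var", "/var/lib/pgsql/data"}))
```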
Note
To prevent SafeKit from starting on a non-mounted and empty directory, you can insert in userconfig.xml the checking of a file inside the replicated directory. Example for var/ (do the same for data/ with a file inside this directory that is always present):
<replicated dir="/var/lib/pgsql/var" mode="read_only">
<tocheck path="postgresql.conf" />
</replicated>
What to do for de-configuring the module (or uninstalling the whole SafeKit)
If you want to deconfigure the module (or uninstall the whole SafeKit), you must reverse this procedure:
Run umount /var/lib/pgsql/var_For_SafeKit_Replication and umount /var/lib/pgsql/data_For_SafeKit_Replication
Run /opt/safekit/safekit deconfig -m postgresql
Edit /etc/fstab to undo the previous editing
Run mount /var/lib/pgsql/var and mount /var/lib/pgsql/data
Id : SK-0032
OS / Release : Windows 2003 / Since 7.0.10.8
Problem : A module using <virtual_interface> (such as a farm module) does not start
The module is configured with a virtual IP address on a <virtual_interface> and the configuration succeeded. But the module start fails and the log contains a line saying that vipplug loading failed.
Solution : In Windows 2003, after the module configuration, you have to open the property sheet of the corresponding network interface (the one onto which the new virtual IP address will be added) and click OK to validate the vip driver binding. Then the module should start. On further references to the same network interface (by the same module or other modules), the above procedure is not needed.
In previous releases, vip driver binding was done during SafeKit install on all network interfaces. Since 7.0.10, vip driver bindings are activated on demand at configuration time, only on the network interfaces that need the vip driver. This avoids configuration problems on platforms using software VLANs on other network interfaces.
In Windows 2008, the above procedure is not needed.
Id : SK-0033
OS / Release : All / All
Problem : SafeKit servers cannot communicate when the firewall is on
When the firewall is turned on, you have to configure it to allow connections on the SafeKit module ports. The list of used ports is returned by the command:
safekit module getports -m AM
Id : SK-0034
OS / Release : Red Hat Enterprise Linux 6 / Since 7.0.10.23
Problem : If NetworkManager is used to manage network interfaces, SafeKit cannot work properly in case of network failure:
When a network cable is unplugged, the network interface is unconfigured, and a module using <virtual_interface> fails with the error: “vipplug config error: Can't get interface for address ...Error: environment modification need re-configuration”. When the cable is plugged in again, the SafeKit module start fails, and you have to run “safekit config” again.
Problems can also occur with a module using <real_interface>.
OS / Release : Red Hat Enterprise Linux 6 / 7.1.3
Problem : If NetworkManager is used to manage network interfaces:
When a network cable is unplugged, the network interface is unconfigured, and a mirror module using <real_interface> loops with the errors: "nfsboxv3 Internal error: bind failed (99) and heart bind error 99"
Solution : Stop NetworkManager and use system-config-network to configure network interfaces:
On your server, run:
service NetworkManager stop
chkconfig NetworkManager off
chkconfig network on
service network start
Then run system-config-network to manage your network interfaces.
Id : SK-0035
OS / Release : Red Hat Enterprise Linux / Since 7.0.11
How to : Enable Oracle Direct NFS with SafeKit file mirroring
Since SafeKit 7.0.11, you can configure SafeKit file mirroring with Oracle 11g Direct NFS.
You first have to configure Oracle for Direct NFS while SafeKit and Oracle are stopped. For this, refer to the Oracle documentation. It consists in changing the ODM library by running:
cd $ORACLE_HOME/lib
cp libodm11.so libodm11.so_stub
ln -s libnfsodm11.so libodm11.so
Then you can start Oracle and check that Direct NFS is enabled. Oracle records the use of Direct NFS in alert.log and also in the internal catalog v$dnfs tables. For instance, you can check the table of servers accessed using Direct NFS by running:
su - oracle
sqlplus
system (login)
system (password)
select * from v$dnfs_servers;
Then configure the module with the attribute pmapset="on". This option can be applied to only one module.
Then apply the new configuration and start the module. You can check that Oracle uses Direct NFS and connects to the nfsbox port instead of the standard nfsd port 2049.
The nfsbox port is the nfs_port listed by the command safekit module getports -m AM. To check connections, read the alert.log and the v$dnfs tables. You can also run the command lsof -Pnl +M -i4 (for IPv4) or lsof -Pnl +M -i6 (for IPv6), which lists all process connections. You should see oracle processes connected to nfs_port.
To roll back to the standard Oracle configuration, stop the module, reconfigure it with the attribute pmapset="on" removed, and revert the Oracle configuration for Direct NFS.
Id : SK-0036
OS / Release : All / 7.0.11
Problem : Problems in WebConsole with I9 updates.
Id : SK-0037
OS / Release : All / 7.0.11
Problem : Unable to configure virtual_addr in mirror mode
Solution : Add the following section to the configuration file (userconfig.xml):
<farm>
<lan>
<node name="node" addr="127.0.0.1"/>
</lan>
</farm>
Id : SK-0038
OS / Release : All / Since 7.1
Change : The mailsend binary is no longer delivered with the SafeKit package
Since the 7.1 release, mailsend is no longer delivered with the SafeKit package.
For Windows, you can download the Windows binary from the mailsend download area.
For Unix, you can use the mail command instead of mailsend. For instance, the following line, inserted in the poststop script of a module, notifies about the stop of the module:
echo "Running poststop" | mail -s "Stop module $SAFEMODULE on `hostname`" admin@mydomain.com
where "Running poststop" is the mail's body and "Stop module $SAFEMODULE on `hostname`" is the mail's subject.
Id : SK-0039
OS / Release : All / Since 7.0.9
How to : Disable the SSL 2 protocol in the SafeKit web server configuration
To disable insecure protocols like SSL 2.0 and weak ciphers, replace the line:
SSLCipherSuite ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP:+eNULL
with the lines:
SSLProtocol -ALL +SSLv3 +TLSv1
SSLCipherSuite ALL:!ADH:RC4+RSA:+HIGH:+MEDIUM:!LOW:!SSLv2:!EXPORT
Then restart the web server:
safekit webserver stop sync
safekit webserver start sync
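The difference between the two SSLCipherSuite values in this entry can be checked mechanically. In a cipher string, a class prefixed with "!" is removed from the list for good, so the sketch below simply reports which weak classes lack such an exclusion. The function name and the default class list (taken from the second SSLCipherSuite line) are illustrative choices, not part of SafeKit.

```python
def unexcluded_classes(cipher_suite, classes=("SSLv2", "LOW", "EXPORT")):
    """Return the classes that have no '!CLASS' token in the cipher string."""
    tokens = cipher_suite.split(":")
    return [name for name in classes if "!" + name not in tokens]


# The two SSLCipherSuite values quoted in this entry.
old_value = "ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP:+eNULL"
new_value = "ALL:!ADH:RC4+RSA:+HIGH:+MEDIUM:!LOW:!SSLv2:!EXPORT"

print(unexcluded_classes(old_value))
print(unexcluded_classes(new_value))
```

The check is purely textual: it does not ask OpenSSL which ciphers are actually enabled.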
Id : SK-0040
OS / Release : Linux / Since 7.1.0.8
Problem : IP address conflict and load-balancing problems can occur if the virtual IP address is an IPv6 address (restriction).
Id : SK-0041
OS / Release : All / Since 7.1.2
Problem : SafeKit web console and Internet Explorer 8
The SafeKit web console may not be displayed correctly in IE 8 and may return "xml parse" errors.
Id : SK-0042
OS / Release : All / Since 7.1
How to : Configure a farm module with the spread communication protocol, which has been replaced by a proprietary protocol since SafeKit 7.1.
<farm spread="on">
<lan>
<node name="node1" addr="192.168.208.5"/>
<node name="node2" addr="192.168.208.6"/>
</lan>
</farm>
<vip>
<interface_list>
<interface arpreroute="off" check="off">
<virtual_interface type="vmac_invisible">
<virtual_addr addr="192.168.208.56" where="alias"/>
</virtual_interface>
</interface>
</interface_list>
<loadbalancing_list>
<group name="FarmProto">
<!-- Set load-balancing rule -->
<rule filter="on_port" proto="tcp" port="9000"/>
</group>
</loadbalancing_list>
</vip>
Id : SK-0043
OS / Release : All / Since 7.1
How to : Configure a mirror module with a virtual IP address mapped on a virtual MAC address.
<vip>
<interface_list>
<interface check="on">
<virtual_interface type="vmac_invisible">
<virtual_addr addr="192.168.208.56" check="off" where="one_side_alias"/>
</virtual_interface>
</interface>
</interface_list>
<loadbalancing_list>
<group name="mirrorgrp">
<rule filter="on_addr" proto="tcp" port="*"/>
</group>
</loadbalancing_list>
</vip>
<farm>
<lan>
<node name="node1" addr="127.0.0.1"/>
</lan>
</farm>
Id : SK-0044
OS / Release : All / 7.1.1
Errata : Use a third machine as spare for a mirror module (User's Guide section 5.10)
The safekit config command must be issued on this machine.
Id : SK-0046
OS / Release : All / Since 7.1.1
Problem : Web console problems after SafeKit upgrade
Solution : You have to clear your browser's cache to get the new web console pages. A quick way to do this is a keyboard shortcut that works in IE, Firefox, and Chrome.
Open the browser on any web page and hold CTRL and SHIFT while pressing the DELETE key. (This is NOT CTRL, ALT, DEL.)
A dialog box opens to clear the browser data. Set it to clear everything and click Clear Now or Delete at the bottom.
Close the browser, stop the process still running in the background if necessary, and re-open it to test what wasn't working previously.
Id : SK-0047
OS / Release : All / Since 7.1.2
Problem : The interface checker "intf" attribute and "-I" parameter are deprecated
Solution :
If the "intf" attribute is specified in the configuration file userconfig.xml, it is ignored and a message of level "D", "Deprecated argument -I", is emitted at runtime.
If the interface checker process intfcheck.exe is started on the command line with the extra argument "-I", e.g.:
safekit -r intfcheck <module> <resourcename> -A none -l <ipaddress> -I <interfacename>
the -I argument is ignored and a message of level "D", "Deprecated argument -I", is emitted at runtime.
Id : SK-0048
OS / Release : All / Since 7.1.3 and < 7.1.3.5
How to : Administer with the web console modules installed before securing SafeKit servers (with https)
You have secured the SafeKit web console with https (see the SafeKit User's Guide).
If modules were installed before securing the SafeKit servers, you have to deploy them again to change the administration network URL to protocol https and port 9453 :
Id : SK-0049
OS / Release : All / Since 7.0
Problem : Web console secured with https: problem using a literal IPv6 address
If you use https://[IPV6]:9453/ or http://[IPV6]:9010/, where IPV6 is a literal IPv6 address, the connection fails with "Internet Explorer cannot display the webpage".
See : Apache-Bugzilla-Bug 52831
Solution : Connecting with https://[IPV6]:9453/deploy.html, https://[IPV6]:9453/monitor.html ... will work. Alternatively, do not use literal IPv6 addresses.
Mars Id : 44424
Id : SK-0050
OS / Release : Windows / Since 7.1.2.18
Problem : Process monitoring fails when the process name contains uppercase letters
The User's Guide recommends using the command safekit -r errdpoll_running to get the names of running processes. The displayed name can be used to configure process monitoring in the <errd> section of the configuration file userconfig.xml. Since SafeKit 7.1.2.18, the displayed name keeps its original case, whereas it should be in lower case. The reason is that the process name comparison for process monitoring is not case sensitive.
Solution : When defining process monitoring in the <errd> section of userconfig.xml, the value of the name attribute of <proc> must be in lower case. If not, the process name matching will fail.
Mars Id : 53612
Fix : In SafeKit > 7.1.3.6, safekit -r errdpoll_running displays the command name in lower case
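A configuration can be scanned for <proc> names that are not in lower case. The sketch below uses Python's standard XML parser on a reduced, hypothetical <errd> fragment: real <proc> elements carry other attributes, which are omitted here, and the process names are invented for the example.

```python
import xml.etree.ElementTree as ET


def non_lowercase_proc_names(config_xml):
    """Return the name attribute of every <proc> element whose name is not all lower case."""
    root = ET.fromstring(config_xml)
    names = [proc.get("name") for proc in root.iter("proc")]
    return [name for name in names if name and name != name.lower()]


# Hypothetical fragment; only the name attribute matters for this check.
sample_config = '<errd><proc name="MyService.exe"/><proc name="oracle.exe"/></errd>'
print(non_lowercase_proc_names(sample_config))
```

Any name reported by the function should be rewritten in lower case in userconfig.xml.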
Id : SK-0051
OS / Release : All / Since 7.1
Problem : The animated progress bar is not displayed in the web console with IE11
Solution : Follow the options below and check:
Id : SK-0052
OS / Release : All / Since 7.1
Problem : SafeKit modules fail to start at boot when safeagent is set to automatic start
Solution : Follow the procedure below
Id : SK-0053
OS / Release : Windows / 7.1
Problem : LPR server: connections on the virtual IP don't work
Mars Id : 54939
Fix : From SafeKit 7.1.3.15 :
Follow the procedure below
Id : SK-0054
OS / Release : All / 7.1.3
Problem : When a custom checker sets the resource state, a message is logged in the module log even if the resource state did not change
Mars Id : 56903
Solution : Edit your custom checker so that it runs the command setting the resource state only if the state has changed
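The idea of this workaround is to remember the last state that was set and to skip the call when nothing has changed. The sketch below shows that logic only: the class name is hypothetical, and `set_state` stands for whatever command your checker runs to set the resource state, which is not reproduced here.

```python
class ResourceStateReporter:
    """Forward a resource state to set_state only when it differs from the last one sent."""

    def __init__(self, set_state):
        self._set_state = set_state  # callable taking the new state ("up" or "down")
        self._last_state = None      # nothing has been reported yet

    def report(self, state):
        """Return True if the state was forwarded, False if it was unchanged."""
        if state == self._last_state:
            return False
        self._set_state(state)
        self._last_state = state
        return True


# Stand-in for the real command: record the calls in a list.
calls = []
reporter = ResourceStateReporter(calls.append)
for observed in ["up", "up", "down", "down", "up"]:
    reporter.report(observed)
print(calls)
```

The first observed state is always forwarded, because no previous state is known at checker start.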
Id : SK-0055
OS / Release : All / 7.1.3
How to : Force a server that is not up to date to start automatically as primary when the up-to-date server is not running?
<check>
<!-- arg is the interval in sec between 2 checks -->
<custom ident="pingremote" when="pre" exec="ping_remote" arg="10"/>
<!-- 1st arg is the interval in sec between 2 checks (>=30) -->
<!-- 2nd arg is the accepted elapsed time in min since the last synchronisation time (>1) -->
<custom ident="synced" when="pre" exec="syncedcheck" arg="30 10"/>
</check>
<failover>
<![CDATA[
force_uptodate: if (heartbeat.* == down && custom.pingremote == down && custom.synced == up && rfs.uptodate == down) then rfs.uptodate=up;
]]>
</failover>
The custom checkers set:
custom.pingremote to up if responding, to down if not responding.
custom.synced to up if the data is up-to-date, or if it is not up-to-date but was synchronised within the last elapsedtime minutes. The value of elapsedtime is the 2nd value of the attribute arg in the custom checker configuration: <custom ident="synced" when="pre" exec="syncedcheck" arg="30 10"/>
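The force_uptodate rule can be read as a plain boolean condition on resource states. The sketch below mirrors it on a dictionary of states. It assumes that "heartbeat.* == down" means that every heartbeat resource is down, which is an interpretation rather than a documented semantic; the function name is hypothetical.

```python
def apply_force_uptodate(resources):
    """Return a copy of the resource states with rfs.uptodate forced to up when the rule matches."""
    heartbeats = [state for name, state in resources.items() if name.startswith("heartbeat.")]
    rule_matches = (
        all(state == "down" for state in heartbeats)  # assumed meaning of heartbeat.* == down
        and resources.get("custom.pingremote") == "down"
        and resources.get("custom.synced") == "up"
        and resources.get("rfs.uptodate") == "down"
    )
    updated = dict(resources)
    if rule_matches:
        updated["rfs.uptodate"] = "up"
    return updated


# Remote server unreachable and local data recently synchronised.
states = {
    "heartbeat.0": "down",
    "heartbeat.flow": "down",
    "custom.pingremote": "down",
    "custom.synced": "up",
    "rfs.uptodate": "down",
}
print(apply_force_uptodate(states)["rfs.uptodate"])
```

If one heartbeat is still up, or if the remote server answers the ping, the states are returned unchanged.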
Id : SK-0056
OS / Release : All / 7.1.3
Problem : Incompatible configuration options <interface arpreroute="on" and <virtual_interface type="vmac_invisible"
Mars Id : 57173
Solution :
Set <interface arpreroute="off" when <virtual_interface type="vmac_invisible"
Set <interface arpreroute="on" arpelapse="60" arpinterval="5" only when type="vmac_directed"
Id : SK-0057
OS / Release : Linux RH5 and RH6 / 7.1
How to : Use the RedHat httpd server instead of the SafeKit httpd server
/opt/safekit/safekit webserver stop
cd /opt/safekit/web
mv -f lib/libcrypto* ../private/bin
mv -f lib/libssl* ../private/bin
mv -f lib lib.safekit
mv -f modules modules.safekit
ln -s /usr/lib64/httpd/modules/ modules
/opt/safekit/safekit webserver start
Id : SK-0058
OS / Release : All / >= 7.1.3.16
Problem : In a farm module, how to start load-balancing once the application is started and stop load-balancing before stopping the application
Id : SK-0059
OS / Release : Windows / 7.1
How to : Use an externally built httpd server instead of the SafeKit built-in httpd server
safekit webserver stop
safekit webserver start
Id : SK-0060
OS / Release : Windows / 7.1.3
Problem : Checkers fail to start on module start after a crash of the server
Mars Id : 57364
Solution : Apply the following manual procedure as a workaround.
safeadmin, insert the line:
del "c:\safekit\var\mapper.xml"
Id : SK-0061
OS / Release : Windows 2008 and Windows 2008 R2 / All
Problem : File replication errors may occur when an application extends a file (most notably, in write_through mode)
This problem is in part due to a misbehaviour of the Microsoft NTFS.sys file system driver, described on the Microsoft support site at http://support.microsoft.com/kb/976538/en-us/
Fix : When using file replication, it is mandatory to update the Windows OS at least to the level indicated at http://support.microsoft.com/kb/976538/en-us/. The update procedure is also described in that knowledge base entry.
Id : SK-0062
OS / Release : All / 7.2
Problem : Web console: do not use literal IPv6 addresses (e.g. 3ffe:2a00:100:7031::1)
In the SafeKit 7.2 web console, you have to fill in the addresses of the SafeKit servers to configure the web console inventory and the SafeKit clusters. These addresses are used by the web console to connect to the servers (but an IPv6 address in a URL must be surrounded by square brackets). The web console does not yet manage both formats.
Mars Id : 59106
Solution : The workaround is to use DNS names instead of literal IPv6 addresses.
Id : SK-0063
OS / Release : Windows 2008 R2 / 7.2
Problem : The 3 nodes replication (3nodesrepli.safe) configuration fails
Mars Id : 59141
The 3nodesrepli.safe configuration relies on PowerShell scripts that require, for a correct execution, a change of the execution policy and version 4.0.
Solution :
Id : SK-0064
OS / Release : Windows 2008 R2 / 7.2
Problem : SafeKit drivers fail to load when the Windows 2008 R2 release does not include support for SHA-2 signing and verification
Fix : You have to update your system to include support for SHA-2.
Refer to the Microsoft Security Advisory at https://technet.microsoft.com/en-us/library/security/2949927.aspx
Id : SK-0065
OS / Release : All / 7.2
Problem : With IE11, a "connection error" can occur after a while when the web console is secured with https. Stopping and restarting the browser is then necessary.
Solution : In "Internet Options"/"Advanced", unselect "TLS 1.0" and "TLS 1.2" and select only "TLS 1.1".
Id
: SK-0066
OS
/ Release : Windows / 7.1 and 7.2
Problem
: How to configure the USN journal in Windows when namespacepolicy="3"
in <rfs>
tag
Solution : In Windows, to enable zone reintegration after reboot when the module has been properly stopped, rfs component use the NTFS USN change journal to check that saved information on zones are still valid after reboot. When the check succeeds, zone reintegration can be applied on the file; otherwise, full reintegration must be used.
To enable the use of USN change journal, set namespacepolicy="3"
in <rfs>
tag.
By default, an NTFS volume will have its USN journal active only the system drive. If the replicated directories are located on a drive different from the system drive, you have to explicitly activate the journal.
Run the following command, as an administrator, to check that the USN journal is enabled on your drive:
fsutil usn queryjournal D:
(replace D: with the desired drive).
If the command returns "Error: The volume change journal is not active", run the following command, as an administrator, to create the USN journal:
fsutil usn createjournal m=536870912 a=67108864 D:
(replace D: with the desired drive) ; where m, for maximum size, specifies the maximum size, in bytes, that NTFS allocates for the change journal and a, for allocation delta, specifies the size, in bytes, of memory allocation that is added to the end and removed from the beginning of the change journal.
See SK-0067 before starting the module after the USN journal creation.
The default USN journal maximum size is 512 MB. If your volume contains 400,000 files or fewer, no additional configuration is required. For every 100,000 additional files on a volume containing replicated directories, increase the USN journal size by 128 MB. If files on the volume are changed or renamed frequently (regardless of whether they are part of the replica set), consider sizing the USN journal larger than these recommendations to prevent USN journal wraps, which can occur when large numbers of files change so quickly that the USN journal must discard the oldest changes to stay within the specified size limit.
The table below gives the values needed to create the USN journal for different numbers of files.
Number of files    m: maximum size in bytes    a: allocation delta in bytes    m in MB
400 000            536 870 912                 67 108 864                      512
600 000            805 306 368                 100 663 296                     768
800 000            1 073 741 824               134 217 728                     1 024
1 000 000          1 342 177 280               167 772 160                     1 280
1 200 000          1 610 612 736               201 326 592                     1 536
1 400 000          1 879 048 192               234 881 024                     1 792
1 600 000          2 147 483 648               268 435 456                     2 048
1 800 000          2 415 919 104               301 989 888                     2 304
2 000 000          2 684 354 560               335 544 320                     2 560
2 200 000          2 952 790 016               369 098 752                     2 816
2 400 000          3 221 225 472               402 653 184                     3 072
2 600 000          3 489 660 928               436 207 616                     3 328
2 800 000          3 758 096 384               469 762 048                     3 584
3 000 000          4 026 531 840               503 316 480                     3 840
3 200 000          4 294 967 296               536 870 912                     4 096
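The sizing rule above (512 MB up to 400,000 files, plus 128 MB for every additional 100,000 files) can be turned into a small calculation. The Python sketch below is only an illustration: the function names are hypothetical, and the ratio a = m / 8 is inferred from the figures listed in this entry, not from a documented rule.

```python
MB = 1024 * 1024

def usn_journal_size(file_count):
    """Return (m, a) in bytes for a volume holding `file_count` files."""
    extra_files = max(0, file_count - 400_000)
    # Round up to the next block of 100,000 additional files.
    extra_steps = -(-extra_files // 100_000)
    m = (512 + 128 * extra_steps) * MB
    # Assumption: allocation delta is one eighth of the maximum size,
    # as observed in the figures of this entry.
    a = m // 8
    return m, a

def createjournal_command(file_count, drive="D:"):
    """Build the fsutil command line for the computed sizes."""
    m, a = usn_journal_size(file_count)
    return f"fsutil usn createjournal m={m} a={a} {drive}"

print(createjournal_command(1_000_000))
```

For 1,000,000 files this yields m=1342177280 and a=167772160, the same values as the corresponding figures in this entry.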
Id
: SK-0067
OS
/ Release : Windows / 7.1 and 7.2
Problem
: The module start hangs in the WAIT (magenta) state after creating the USN journal on the drive containing the replicated directories
The module start hangs in the WAIT (magenta) state with the following messages in the module log:
| 2017-02-23 09:05:58:454000 | nfsboxv3 | D | Directory D:\: Filesystem=NTFS (flags 3e700ff), Volume=Data
| 2017-02-23 09:06:00:302000 | rfsplug | D | Retrying nfsbox port lookup
| 2017-02-23 09:06:00:302000 | rfsplug | D | Waiting for nfsbox ready
| 2017-02-23 09:06:00:303000 | log | D | Last message repeated 2 times
| 2017-02-23 09:06:00:333000 | nfsadmin | D | Retrying nfsbox port lookup
| 2017-02-23 09:06:00:333000 | nfsadmin | D | Waiting for nfsbox initialization
This occurs when the USN journal has just been created on the drive containing the replicated directories and no access has yet been made to the drive.
Solution : After creating the USN journal and before starting the module, make any modification on the drive so as to fill the USN journal. For instance, you can create and then delete a file.
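The create-then-delete step can be scripted. The Python sketch below is one possible way to do it; the function name and the file prefix are hypothetical, and any other modification of the drive would serve the same purpose.

```python
import os
import tempfile

def touch_drive(root):
    """Create, then delete, a small file under `root`; return the path used."""
    fd, path = tempfile.mkstemp(prefix="usn_fill_", dir=root)
    try:
        os.write(fd, b"x")  # a write so that the change is recorded
    finally:
        os.close(fd)
    os.remove(path)
    return path
```

For example, `touch_drive("D:\\")` would be run once, as an administrator, after `fsutil usn createjournal` and before starting the module.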
Id
: SK-0068
OS
/ Release : Windows / 7.2
How to
: Use of externally built httpd server instead of the SafeKit built-in httpd server
safekit webserver stop
safekit webserver start
Id
: SK-0069
OS
/ Release : Linux / > 7.2.0.29
How to
: Use of Linux httpd server instead of the SafeKit httpd server
/opt/safekit/safekit webserver stop
/opt/safekit/safekit webserver start
Id
: SK-0070
OS
/ Release : Linux / 7.3
How to
: Use MySQL with SafeKit when SELinux is "Enforcing"
mkdir: cannot create directory /var/lib/mysql: File exists
mariadb.service: main process exited, code=exited, status=1/FAILURE
mariadb.service: control process exited, code=exited status=1
Failed to start MariaDB database server.
[Note] /usr/libexec/mysqld (mysqld 5.5.44-MariaDB) starting as process 29039 ...
[Warning] Can't create test file /var/lib/mysql/alambix2.lower-test
/usr/libexec/mysqld: Can't change dir to '/var/lib/mysql/' (Errcode: 13)
170426 8:55:20 [ERROR] Aborting
setenforce 0
/sbin/service auditd rotate , to rotate the SELinux log file "/var/log/audit/audit.log"
semodule -DB , to remove "dontaudits from policy" (log becomes more verbose)
systemctl start mariadb and systemctl stop mariadb
grep mysqld /var/log/audit/audit.log | audit2allow -M NewMySQL , 2 files are created : NewMySQL.pp and NewMySQL.te
semodule -B
semodule -i NewMySQL.pp
setenforce 1
module NewMySQL 1.0;
require {
type var_lib_t;
type mysqld_safe_t;
type nfs_t;
type mysqld_t;
class process { siginh noatsecure rlimitinh };
class sock_file { create unlink };
class lnk_file { read getattr };
class file { write getattr read lock create unlink open };
class dir { write remove_name getattr add_name };
}
#============= mysqld_safe_t ==============
#!!!! This avc has a dontaudit rule in the current policy
allow mysqld_safe_t mysqld_t:process { siginh rlimitinh noatsecure };
#!!!! This avc has a dontaudit rule in the current policy
allow mysqld_safe_t nfs_t:dir getattr;
allow mysqld_safe_t var_lib_t:lnk_file read;
#============= mysqld_t ==============
allow mysqld_t nfs_t:dir { write remove_name add_name };
allow mysqld_t nfs_t:file { write getattr read lock create unlink open };
allow mysqld_t nfs_t:sock_file { create unlink };
allow mysqld_t var_lib_t:lnk_file { read getattr };
checkmodule -M -m -o NewMySQL.mod NewMySQL.te
semodule_package -o NewMySQL.pp -m NewMySQL.mod
semodule -i NewMySQL.pp
Id
: SK-0071
OS
/ Release : Linux / 7.3
drop database MaBase;
ERROR 1010 (HY000): Error dropping database (can't rmdir './MaBase', errno: 13)
create database MaBase;
ERROR ...(HY000): Error creating database (can't mkdir './MaBase', errno: 13)
class dir { write remove_name getattr add_name }
with : class dir { create rmdir write remove_name getattr add_name }
allow mysqld_t nfs_t:dir { write remove_name add_name }
allow mysqld_t nfs_t:dir { create rmdir write remove_name add_name }
checkmodule -M -m -o NewMySQL.mod NewMySQL.te
semodule_package -o NewMySQL.pp -m NewMySQL.mod
semodule -i NewMySQL.pp
Id
: SK-0072
OS
/ Release : Linux / 7.3
How to
: Set SELinux to "Permissive" mode OR set only enforcement mode for MySQL to "Permissive"
setenforce 0
, to see the current mode : getenforce
semanage permissive -a mysqld_t
Id
: SK-0073
OS
/ Release : Windows / 7.2 and < 7.3.0.14
Mars Id : 62147
Fix : Fixed in SafeKit >= 7.3.0.14
Problem
: 3nodesrepli / SafeKit upgrade : After upgrade procedure, the module does not start and DR node indicator does not appear.
Set DR node
Id
: SK-0074
OS
/ Release : All / All
Mars Id : 63124
Problem
: With IE, the file may be truncated when loaded into the SafeKit Web console editor
Text files created on DOS/Windows machines have different line endings than files created on Unix/Linux. DOS uses carriage return and line feed ("\r\n") as a line ending, while Unix uses just line feed ("\n").
In IE, the editor of the SafeKit web console may truncate files using DOS line ending format.
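The difference between the two line-ending formats can be removed before the file is loaded into the editor. The Python sketch below shows the conversion from "\r\n" to "\n"; the function names are hypothetical and the file is rewritten in place, so keep a copy if in doubt.

```python
def to_unix_line_endings(data: bytes) -> bytes:
    """Replace every DOS line ending (CR LF) by a Unix line ending (LF)."""
    return data.replace(b"\r\n", b"\n")

def convert_file(path):
    """Convert `path` in place; return True if the file was modified."""
    with open(path, "rb") as f:
        data = f.read()
    converted = to_unix_line_endings(data)
    if converted != data:
        with open(path, "wb") as f:
            f.write(converted)
    return converted != data
```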
:set ff=unix ; then save the file
"Edit" menu, select "EOL Conversion" -> "UNIX/OSX Format" ; then save the file
Id
: SK-0075
OS
/ Release : Linux / > 7.3.0.10
How to
: Configure safewebserver on SLES12
/opt/safekit/web/bin/safeapachectl script according to the inline comments
/opt/safekit/web/conf/httpd.conf.sles12 to /opt/safekit/web/conf/httpd.conf
/opt/safekit/safekit webserver start
Id
: SK-0076
OS
/ Release : Linux / > 7.1
Problem
: Could not configure cluster : got Error:incoherent local name ...
Check the sysctl option net.ipv4.ip_nonlocal_bind ; it must be 0.
If it is not, set it with the command sysctl net.ipv4.ip_nonlocal_bind=0 and retry the cluster configuration.
Check /etc/sysctl.conf and /etc/sysctl.d to be sure that this option is not set at boot time.
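The two checks above (current value, then boot-time settings) can be automated. The Python sketch below assumes that the current value is exposed in /proc/sys/net/ipv4/ip_nonlocal_bind, as on standard Linux kernels; the function names are hypothetical.

```python
import re
from pathlib import Path

OPTION = "net.ipv4.ip_nonlocal_bind"

def nonlocal_bind_value(proc_file="/proc/sys/net/ipv4/ip_nonlocal_bind"):
    """Return the current value of the option (expected: 0)."""
    return int(Path(proc_file).read_text().strip())

def boot_time_settings(files):
    """Return (file, value) for each non-comment line that sets OPTION."""
    pattern = re.compile(r"^\s*" + re.escape(OPTION) + r"\s*=\s*(\d+)")
    found = []
    for name in files:
        for line in Path(name).read_text().splitlines():
            match = pattern.match(line)
            if match:
                found.append((str(name), int(match.group(1))))
    return found
```

A typical call would pass `/etc/sysctl.conf` together with the `*.conf` files found under `/etc/sysctl.d`; any returned entry with a value other than 0 would reset the option at the next boot.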
Id
: SK-0077
OS
/ Release : Linux / 7.3
Problem
: Messages : Error: INVALID_SERVICE: 'safeagent' not among existing services at safekitinstall
Remove obsolete safeagent firewalld service : firewall-cmd --remove-service=safeagent
Id
: SK-0078
OS
/ Release : All / < 7.3.0.24
Problem
: Mirror module stays in the WAIT (magenta) state on both nodes or failover rules do not apply
userconfig.xml
. Having 2 CDATA sections under <failover> leads to this behavior. For instance:
<failover>
<![CDATA[
is_alone: if(custom.checkaround == down) then restart();
]]>
<![CDATA[
is_isolated: if(custom.checkisolated == down) then stopstart();
]]>
</failover>
Instead, group all the rules in a single CDATA section:
<failover>
<![CDATA[
is_alone: if(custom.checkaround == down) then restart();
is_isolated: if(custom.checkisolated == down) then stopstart();
]]>
</failover>
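A configuration file can be checked for this pattern before it is applied. The Python sketch below counts the CDATA sections found in each <failover> element; it works on the raw text because most XML parsers merge adjacent CDATA sections. The function name is hypothetical.

```python
import re

def count_failover_cdata(xml_text):
    """Return the number of CDATA sections in each <failover> element."""
    blocks = re.findall(r"<failover\b[^>]*>(.*?)</failover>", xml_text, re.DOTALL)
    return [block.count("<![CDATA[") for block in blocks]
```

Any value greater than 1 in the returned list points to a <failover> element whose rules should be merged into one CDATA section.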
Id
: SK-0079
OS
/ Release : Windows / >= 7.4.0.16
Problem
: Module not starting correctly if cluster configuration contains DNS names
Mars Id : 69307
Solution
:
If you are using DNS names in cluster.xml, please check on all nodes that the address displayed by the “ping” command for the local DNS address is the same as the address displayed by the “nslookup” command.
If it is not the case, you need to alter the node’s Windows network configuration interface and route metric so that the above condition is fulfilled.
Since SafeKit 7.4.0.54, DNS names are resolved during the cluster configuration and IP addresses are stored into the file c:/safekit/var/cluster/cluster_ip.xml. You can check that the DNS name resolution is correct by verifying the content of this file.
Id
: SK-0080
OS
/ Release : All / < 7.4.0.16
Problem
: Module communication failures if cluster configuration contains DNS names
Workaround
:
A workaround is to set only IP addresses. But if you require DNS names for accessing the SafeKit web console, the workaround is to set 2 lan sections in the cluster configuration: one lan definition with DNS names, used only by the SafeKit web console ; one lan definition with IP addresses, used for the framework communications. For instance, the cluster configuration may look like the following one :
<cluster>
<lans>
<lan name="default" connect="on" console="on" framework="off">
<node name="node1" addr="node1.safe"/>
<node name="node2" addr="node2.safe"/>
</lan>
<lan name="private" connect="off" console="off" framework="on">
<node name="node1" addr="172.23.188.101"/>
<node name="node2" addr="172.23.188.102"/>
</lan>
</lans>
</cluster>
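The rule illustrated by this configuration (DNS names only on the console lan, IP addresses on the framework lan) can be verified with a short script. The Python sketch below is an illustration under one assumption: a lan is treated as a framework lan only when framework="on" is written explicitly. The function name is hypothetical.

```python
import ipaddress
import xml.etree.ElementTree as ET

def framework_nodes_without_ip(cluster_xml):
    """Return (lan, node, addr) for framework lans whose addr is not an IP literal."""
    root = ET.fromstring(cluster_xml)
    problems = []
    for lan in root.iter("lan"):
        # Assumption: only lans with an explicit framework="on" are checked.
        if lan.get("framework") != "on":
            continue
        for node in lan.iter("node"):
            addr = node.get("addr", "")
            try:
                ipaddress.ip_address(addr)
            except ValueError:
                problems.append((lan.get("name"), node.get("name"), addr))
    return problems
```

An empty list means that every node of the framework lans is declared with a literal IP address.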
Id
: SK-0081
OS
/ Release : Windows 10 Pro / 7.4
Problem
: Hyper-V module (hyperv.safe) start fails with plugwait error
Solution :
Change the execution policy as follows:
Id
: SK-0082
OS
/ Release : Windows / 7.4
Problem
: Hyper-V module (hyperv.safe) failover fails with VM import failure
In SAFEVAR/modules/AM/userlog.ulog (where SAFEVAR=c:\safekit and AM is your module name), you have the following message:
Import-VM: Unable to import the virtual machine due to configuration errors. Use Compare-VM to to repair the virtual machine.
The VM import during the failover is equivalent to the virtual machine (VM) migration that consists in moving the VM from physical server node1 to node2. The import may fail when the migration requirements are not met.
Solution :
Check the common requirements for Hyper-V VM migration depending on your Windows release number.
These requirements apply to the physical server settings (processor, Active Directory domain, ...) and the VM settings (virtual hard disks, virtual networks, ...).
To check for incompatibilities, you can try to manually import the VM on node2 while SafeKit is stopped. This logs incompatibility error messages.
One common error is that the host hardware isn't compatible. This occurs when a virtual machine has one or more snapshots, and hosts have different processor versions.
To fix this problem, shut down the virtual machine on node1 and turn on the processor compatibility setting as follows:
Id
: SK-0083
OS
/ Release : Windows / 7.4
Problem
: Some SafeKit components and modules fail on Windows
Solution :
Change the execution policy as follows:
Id
: SK-0084
OS
/ Release : All />= 7.2
Mars Id : 71535
Problem
: In mirror modules, data reintegration fails on expiration of cryptographic keys
SafeKit relies on a certificate for securing module internal communications. With SafeKit <= 7.4.0.31, the validity period for this certificate is 1 year.
When the certificate expires, the module goes to ALONE/STOP with the application still running on the ALONE node. The secondary fails to reintegrate with the following message:
reintegre | D | XXX clnttcp_create: socket=7 TLS handshake failed
To check that your module is using encrypted communication, check that the file named modulekey.p12 is present in SAFE/modules/AM/conf/ (where AM is the module name).
The certificate expiration date is, most of the time, 1 year after the creation date of this file. For a more precise date, please contact the support.
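The "1 year after the creation date of the file" estimate can be computed with a few lines of Python. The sketch below is only an approximation: it uses the last modification time of the file as a stand-in for its creation date, and the function names and the 365-day default are assumptions for illustration.

```python
import os
from datetime import datetime, timedelta

def estimated_expiry(key_file, validity_days=365):
    """Estimate the expiry date from the file timestamp plus the validity period."""
    # Assumption: the modification time approximates the creation date.
    created = datetime.fromtimestamp(os.path.getmtime(key_file))
    return created + timedelta(days=validity_days)

def days_left(key_file, now=None, validity_days=365):
    """Return the number of whole days remaining before the estimated expiry."""
    now = now or datetime.now()
    return (estimated_expiry(key_file, validity_days) - now).days
```

A negative result from `days_left` suggests that the certificate has probably expired; the exact date still has to be confirmed as described above.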
Solution :
The solution consists in generating a new certificate (but this new one will still expire in 1 year). It can be done either with:
safekit module genkey -m AM
safekit -H "*" -E AM
safekit module genkey -m AM
safekit -H "*" -E AM
If you prefer to run with a certificate that has a longer validity period, upgrade to a SafeKit release > 7.4.0.31, which sets the validity period to 20 years.
Id
: SK-0085
OS
/ Release : All / All
Problem
: SafeKit may not run properly when relying on host name resolution service that is not itself highly available
Solution :
To avoid that, you must implement a robust DNS resolution policy on the nodes participating in the cluster, such as :
Id
: SK-0086
OS
/ Release : 7.4
Problem
: Before 7.4.0.48, Safewebserver service fails to start in https mode after installing certificates from external PKI
Solution :
Execute SAFE/web/bin/openssl rsa -in SAFE/web/conf/admin.key -out SAFE/web/conf/rsa-admin.key to convert admin.key to RSA format.
In the SAFE/web/conf directory, concatenate the content of admin.crt and rsa-admin.key files into the proxy.crtkey file using a text editor or CLI, and restart the safewebserver service.
Apply this procedure on each SafeKit node.
Since 7.4.0.48, this procedure is only necessary to use the webconsole with proxy=true.
Id
: SK-0087
OS
/ Release : RedHat 8/ 7.4
Problem
: User scripts executed within the SafeKit environment return, into the application log, an error with openssl version
PAM unable to dlopen(/usr/lib64/security/pam_unix.so): /lib64/libk5crypto.so.3: undefined symbol: EVP_KDF_ctrl, version OPENSSL_1_1_1b
or
symbol lookup error: /lib64/librpmio.so.8: undefined symbol: EVP_md2, version OPENSSL_1_1_0
Workaround :
This problem is probably due to a bad linking with the openssl library delivered with SafeKit, that is,
in some cases, not compatible with the one delivered with RH 8 and used by the application or commands.
User scripts are executed with LD_LIBRARY_PATH environment variable set to SafeKit libraries.
The workaround is to execute commands after unsetting LD_LIBRARY_PATH. Below is an example that starts Oracle in start_prim:
replace
/bin/su - oracle19 -c "/usr/local/bin/startDb"
by
(unset LD_LIBRARY_PATH ; /bin/su - oracle19 -c "/usr/local/bin/startDb" )
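The same idea applies when a user script launches commands from a language other than the shell: the child process is started with an environment that no longer contains LD_LIBRARY_PATH. The Python sketch below illustrates this; the function name is hypothetical and the command to run is whatever your script starts.

```python
import os
import subprocess

def run_without_safekit_libs(command):
    """Run `command` with LD_LIBRARY_PATH removed from the environment."""
    env = {k: v for k, v in os.environ.items() if k != "LD_LIBRARY_PATH"}
    return subprocess.run(command, env=env, capture_output=True, text=True)
```

For instance, `run_without_safekit_libs(["/bin/su", "-", "oracle19", "-c", "/usr/local/bin/startDb"])` mirrors the shell replacement shown above.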
Id
: SK-0088
OS
/ Release : Windows / 7.4, 7.5
Problem
: Hyper-V module (hyperv.safe) failover prerequisite
SAFEVAR/modules/AM/userlog.ulog (where SAFEVAR=c:\safekit and AM is your module name), contains the following message:
Import-VM : Unable to import virtual machine due to configuration errors. Please use Compare-VM to repair the virtual machine.
Solution :
Before testing the Hyper-V module failover, check the common requirements for Hyper-V VM migration depending on your Windows release number.
These requirements apply to the physical server settings (processor, Active Directory domain, ...) and the VM settings (virtual hard disks, virtual networks, ...).
To check for incompatibilities, we recommend the following procedure:
compare-vm -path "D:\Repli-Hyper-V\VM1\Virtual Machines\8CB619CE-CFB4-45BD-908B-F123A2E0AA24.vmcx" -Register
xml instead of vmcx )
This command lists incompatibilities, if any. To get details on the incompatibilities, run
$report = compare-vm -path "D:\Repli-Hyper-V\VM2\Virtual Machines\8CB619CE-CFB4-45BD-908B-F123A2E0AA24.XML" -Register
$report.Incompatibilities | FL
Id
: SK-0089
OS / Release :
All / 7.5.0.16
Problem :
Default failover rule for tcp checkers set to wait instead of restart
Solution : Do not configure tcp checker or upgrade to SafeKit > 7.5.0.16
Mars Id : 74378
Id
: SK-0090
OS / Release :
All / 7.5.0.16
Change :
Failover machine may generate a wakeup before checkers, with wait rules, have time to set the associated resource state to up or down.
Mars Id : 74340
Id
: SK-0091
OS / Release :
Windows / 7.4, 7.5
Problem :
Timeout during reintegration of big files (>50 GB) such as vhd files in Hyper-V module
During the file synchronisation, space on disk may need to be allocated for new or extended files. In Windows, when the file is large or zero filled,
a timeout may occur during the synchronisation if the primary or the reintegration process writes at the end of the file. This leads to synchronisation failures.
This problem may occur with the Hyper-V module (hyperv.safe) where VM disks are implemented by big vhd files.
Solution :
Edit the module XML configuration file SAFE/modules/AM/conf/userconfig.xml (replace AM by the name of the module) and add the option allocthreshold into the <rfs> section as follows:
<rfs allocthreshold="50"
When allocthreshold > 0, fast allocation of disk space is enabled for files to be synchronized on the secondary node
The allocation is applied only:
safekit second fullsync
)
Id
: SK-0092
OS
/ Release : Linux / >= 7.4.0.50 & < 7.5
Problem
: SafeKit web server does not start when using LDAP/AD basic authentication on some Linux distributions (RedHat/CentOS 8)
Solution :
The solution consists in using the Apache HTTP server provided by the Linux distribution.
On SafeKit version > 7.4.0.50 :
A new option has been added to safekitinstall : -extsafewebserver for switching to the external web server during the SafeKit install.
A new script has been added: SAFE/web/bin/setsafewebserver for switching between the internal and the external web server:
setsafewebserver internal : switch to the SafeKit built-in Apache HTTP server
setsafewebserver external : switch to the Linux distribution Apache HTTP server
-n : do not start the web server after setting
httpd package is installed (at least release 2.4.37) and that httpd binary is present under /usr/sbin
mod_ssl package is installed
mod_ldap package is installed if you need LDAP/AD basic authentication
mod_session package is installed (this package is needed for SafeKit 7.5)
yum install httpd mod_ssl mod_ldap fulfills these conditions.
apr , apr-util , apr-util-ldap and apr-util-openssl packages are also to be installed if they have not been installed as dependencies.
Id
: SK-0093
OS
/ Release : Linux / All
Problem
: SafeKit web server does not start when using port 80
Port 80 is a reserved port that can be bound only by root processes or by processes that have the required capability
Solution :
As root , run the command : setcap 'cap_net_bind_service=+ep' /opt/safekit/web/bin/httpd
Id
: SK-0094
OS
/ Release : Windows / 7.4, 7.5
Problem
: SafeKit Replication of anti-ransomware folders
To configure protected folders, use Windows Security; select Virus & threat protection and Manage ransomware protection.
Set Controlled folder access to on and select Protected folders to add folders.
Solution :
To use SafeKit to replicate such directories, you have to allow SafeKit apps to access the protected folders.
Select Allow an app through Controlled folder access, Add an allowed app and Browse all apps.
Then add the following apps :
Id
: SK-0095
OS
/ Release : Linux / 7.5.2
Problem
: one_side VIP and src routes limitations
On the PRI server where a one_side VIP is configured, the src routes are set to the VIP for :
Id
: SK-0096
OS
/ Release : Linux / SafeKit >= 7.5.2.11
Problem
: Zone reintegration is not operational in Linux
JIRA
Id : ES-659
This is a regression that will be corrected in a future version. It is not critical, but it does result in more data being recopied than necessary during reintegration, as zone-based reintegration optimization is disabled.
Id
: SK-0097
OS
/ Release : All / SafeKit 8.2.0 to 8.2.2.2
Problem
: Nodes sometimes show "Connection error" even when only one is down
JIRA
Id : ES-650
When the console loads, if it is connected to node2 and node1 is down (the alphabetical order of node names matters here),
the console displays a ‘connection error’ for all nodes. However, only node1 should be displayed with this state.
This issue does not occur when the console is already loaded and node1 goes down.
Fix : Fixed in SafeKit 8.2.2.3
Id
: SK-0098
OS
/ Release : All / SafeKit >= 8.2.3
Problem: Unable to login to the web console after the OpenId connection expired
JIRA
Id : ES-723
Once the OpenId connection has expired, the web console does not present the login page but only an unauthorized page.
SAFE/web/conf/httpd.webconsoleopenidauth.conf
and uncomment the lines
# Circumvent Console quirks: worker fetches index.html with header Sec-Fetch-Dest set to 'empty' ... So it would get 401 instead of going to the login screen.
OIDCUnAuthAction 401 "%{HTTP:X-Requested-With} == 'XMLHttpRequest' \
|| ( -n %{HTTP:Sec-Fetch-Mode} && %{HTTP:Sec-Fetch-Mode} != 'navigate' ) \
|| ( -n %{HTTP:Sec-Fetch-Dest} && %{HTTP:Sec-Fetch-Dest} != 'document' && %{HTTP:Sec-Fetch-Dest} != 'empty' ) \
|| ( ( %{HTTP_ACCEPT} !~ m#text/html# ) \
&& ( %{HTTP_ACCEPT} !~ m#application/xhtml\+xml# ) \
&& ( %{HTTP_ACCEPT} !~ m#\*/\*# ) )"
SAFE/safekit webserver restart
Id
: SK-0099
OS
/ Release : All / All
How to
: Configure promiscuous mode in hypervisor network
vmac_invisible
virtual interface option, it is required that the network interfaces of the machines on which SafeKit is installed support the promiscuous mode.
For the promiscuous mode to work, it must be configured in the hypervisor settings of the virtual switch or of the virtual network cards, depending on the hypervisor.
ping
command will not work either).
Note that if the type of the SafeKit virtual_interface is not vmac_invisible, but instead is vmac_directed, the virtual IP address will be reachable regardless of whether the promiscuous mode is configured or not.