Welcome to the World of UNIX: December 2012

Friday 7 December 2012

Solaris Live Upgrade Procedure for Patching

CPU cluster patching of Solaris normally takes about an hour to complete, which lengthens the maintenance downtime window. To overcome this, Solaris provides a feature called "Live Upgrade".
The main advantages of Live Upgrade are that it minimizes downtime and lets the system administrator revert to the original OS if patching fails.

In general, a Live Upgrade can be broken down into the following stages:

Create a new boot environment if you haven't already.
Patch the new boot environment.
Boot from the new boot environment.
Check the results and see whether the changes are acceptable.

The steps to perform a Solaris Live Upgrade for patching are as follows:

Step 1: Verify Live Upgrade packages are installed

#pkgchk -v SUNWlucfg SUNWlur SUNWluu

Step 2: Name the current boot environment and create a new one within the same root pool (rpool)

#lucreate -c BE_APP01 -n new_BE_APP01

Step 3: Verify step 2

#zfs list
#lustatus

Step 4: Apply the CPU to the new boot environment

#cd 10_Recommended

#./installpatchset -B new_BE_APP01 --s10patchset

Step 5: Activate the new boot environment
#luactivate -s new_BE_APP01

Step 6: Reboot the server to boot to the new boot environment

#shutdown -i 6 -g 0 -y

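Note: Do NOT use any other command to reboot the server. You must use the command above (shutdown or init), because luactivate warns that otherwise the system will not boot from the new boot environment. Be patient and wait at least 5-10 minutes. If the server still does not reboot, type the command "reboot" as a last resort, keeping in mind that the system may then come up on the old boot environment.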

Step 7: Verify the status

#lustatus

Post Live Upgrade Procedures

If there are no problems, you can delete the former boot environment to save disk space by issuing the command

#ludelete BE_APP01

For future patching, you only need to create a new boot environment within the same root pool (rpool) by issuing the command

#lucreate -n BE_APP01_yyyymmdd
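
The yyyymmdd suffix can be filled in from the current date instead of being typed by hand. The following is a minimal sketch, assuming a POSIX shell. The helper name be_name is hypothetical (it is not part of Live Upgrade), and the script only prints the lucreate command rather than running it.

```shell
#!/bin/sh
# be_name is a hypothetical helper, not part of Live Upgrade.
# It prints PREFIX_yyyymmdd using today's date.
be_name() {
    printf '%s_%s\n' "$1" "$(date +%Y%m%d)"
}

NEW_BE=$(be_name BE_APP01)

# Dry run: show the command instead of executing it.
echo "lucreate -n $NEW_BE"
```

Once the printed command looks right, it can be run as root in place of the hand-typed version.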

Netbackup Important Terminologies -- Part 2

Major Daemons Running in Master/Media/Client

Master - 4 Daemons

i) bprd -- request daemon started during the system boot

ii) bpsched -- schedule daemon

iii) bpdbm -- database manager started by bpsched

iv) bpjobd -- Job monitor daemon started by bpdbm

bprd --> bpsched --> bpdbm --> bpjobd

Media - 9 Daemons

i) bpcd -- Backup Client (Since media itself is a client)

ii) bpbrm -- Backup and restore daemon

iii) bpdm -- Backup disk manager

iv) bptm -- Backup tape manager

v) ltid -- Media manager daemon started during the system boot

vi) avrd -- Automatic Volume Recognition daemon

vii) vmd -- Volume Manager daemon

viii) tldd -- robotic tape library daemon

ix) tldcd -- robotic tape library controller daemon

Client - 2 daemons

i) bpcd -- NB client daemon

ii) bpbkar -- Daemon used to take backup images.

How does the NetBackup process work?

There are 7 steps involved in a NetBackup backup

In Master Server

bprd --> bpsched --> bpdbm --> bpjobd

Step 1:

* When the daemons start, "bprd" on the master server starts "bpsched".

Step 2:

* "bpsched" calls "bpdbm" to check the policies for automatic backup schedules.

* "bpsched" starts a child process to handle the backup.

* "bpdbm" calls "bpjobd" to communicate with the job catalog.

Step 3:

* The "bpsched" child contacts the media server to start "bpcd".

In Media Server

bpcd --> bpbrm --> bptm

Step 4:

* "bpcd" invokes "bpbrm".

* "bpbrm" starts "bptm", which requests a tape mount and spawns a child process to communicate with the client.

* vmd (Volume Manager Daemon) manages the volume catalog and handles media requests throughout the course of the backup job.

* "bpbrm" contacts the client to start "bpcd" on the client.

In Client

Step 5: "bpcd" on the client starts "bpbkar".

Step 6: "bpbkar" reads the data from the client.

* It communicates directly with "bpbrm" to send catalog metadata, which is ultimately written to the image catalog.

* It sends the backup data stream to the child "bptm" process.

Step 7:

* The child "bptm" process passes the data to the parent "bptm" process, which writes the data directly to the destination media.

What is Volume Group and Volume Pool?

Volume Pool

A volume pool is simply a "grouping of tapes".
Here "volume" means the same thing as media/cartridge/tape.
It is a logical set of tape media/volumes.
Before a newly introduced tape can be used and written to, it must belong to one of the volume pools.

We can view the volume pools in the following ways.
In GUI, Media and Device Management --> Media --> Volume Pools

In CLI, #vmpool -list_all -bx

root@MasterServer# vmpool -list_all -bx
pool              index   max partially full   description
----------------------------------------------------------------------------------
None              0       0                    the None pool
NetBackup         1       0                    the NetBackup pool
DataStore         2       0                    the DataStore pool
CatalogBackup     3       0                    NetBackup Catalog Backup pool
SCRATCH_pool      4       0                    for scratch media
APPSERVER_LOGS    5       1                    APP server logs
DATABASE_LOGS     6       3                    Database full
WEBSERVER_FULL    7       1                    Web server full
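
Because the listing above is plain columnar text, it can be filtered with standard tools. The following is a minimal sketch, assuming the output always starts with the two header lines shown and that pool names contain no spaces. The function name pools_with_partial_media is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `vmpool -list_all -bx` style output on stdin and
# print the names of pools whose "max partially full" value is above zero.
pools_with_partial_media() {
    # Skip the column header and the dashed separator (the first two lines).
    awk 'NR > 2 && NF >= 3 && $3 + 0 > 0 { print $1 }'
}

# Example usage on a NetBackup server (assumes vmpool is in PATH):
#   vmpool -list_all -bx | pools_with_partial_media
```

The filter only reads the first and third columns, so descriptions containing spaces do not affect the result.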

Volume Group

A volume group is a physical location for media.
It manages a group of tapes for administrative actions, such as moving tapes from slots to drives and vice versa.
All volumes in a volume group should be of the same media type (for example HCART or DLT).
All volumes in a robotic library must belong to a volume group. You cannot add volumes to a robotic library without specifying a group.
We can view the volume groups in 2 ways.

i) In GUI, Media and Device Management --> Media --> Volume Groups
ii) In CLI, #vmquery -a |grep "media ID:" (this lists the media IDs; the same vmquery -a output also reports the volume group of each volume)
   root@SS73VPBK01 # vmquery -a |grep "media ID:"
   media ID:              CLNU01
   media ID:              PL5001
   media ID:              PL5002
   media ID:              PL5003
   media ID:              PL5004

What are the Default Volume Pools?

There are 4 volume pools created by default:

i) NetBackup --> new tapes go to this pool after inventory

ii) None (for cleaning tapes)

iii) DataStore

iv) CatalogBackup

What is Scratch Pool ?

The scratch pool is a manually created volume pool. It works as follows:

i) Expired tapes (volumes) are automatically moved to the scratch pool by NetBackup.

ii) If another pool used for backups has no tapes available, it takes empty tapes from the scratch pool.

What is Multistreaming and Multiplexing?

Multistreaming

i) Sends data from a single client to multiple drives. Normally used in high-bandwidth environments.

ii) Backups can be divided into streams, with different jobs going to different tapes.

iii) Takes place at the software level.

Multiplexing

i) Sends different jobs/data to a single tape. Configured at the schedule level.

ii) It is very fast, especially for small environments.

iii) Takes place at the hardware level.

Thursday 6 December 2012

Solaris Hardening Procedure -- Part 1

This article describes the hardening procedure for Solaris OS in detail. Hardening touches several different areas of the OS, so because of its length the procedure has been split into 3 parts.

1) Services
    a) Disabling the Restricted Services
        i) Restricted services
            Stop the restricted services, which pose a risk to servers. The following are restricted services.
                telnet
                uucp
                netstat
                comsat
                time
                echo
                discard
                ftp
                tftp
                daytime
                rquotad
                rexecd
                rpc.ttdbserverd
                finger
                talk
                chargen
                ident
                systat
                yppasswdd, ypserv, ypxfrd
                services (e.g. shell, login, klogin, exec) that listen for r-commands (rlogin, rsh, etc.)
                ToolTalk (ttdbserverd)
                Calendar Manager (cmsd)
                statd (unless required by NFS. See Use of NFS section for restrictions)
                sadmind (Solstice admin daemon)
                rstatd
                rusersd
                rwalld
                sprayd
                automount (Solaris)
        ii) SSH client and server
            Only Secure Shell protocol version 2 is allowed; SSH protocol v1 must be disabled. This is set in the file /etc/ssh/sshd_config
                #Protocol 1
                Protocol 2
        iii) Disable NIS services
            #svcadm disable svc:/network/nis/server:default
            #svcadm disable svc:/network/nis/client:default
        iv) Disable Sendmail
            #svcadm disable svc:/network/smtp:sendmail
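
The SSH protocol setting from item ii) can be checked with a short script rather than by eye. This is a minimal sketch, assuming the keyword is spelled exactly "Protocol" and that the first active occurrence is the one sshd uses. The function name ssh_protocol_ok is hypothetical. A file with no active Protocol line is reported as a failure here, even though sshd would then fall back to its built-in default.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the first active (uncommented)
# Protocol line in the given sshd_config is exactly "Protocol 2".
ssh_protocol_ok() {
    conf=${1:-/etc/ssh/sshd_config}
    # "#Protocol 1" is skipped because its first field is "#Protocol".
    value=$(awk '$1 == "Protocol" { print $2; exit }' "$conf")
    [ "$value" = "2" ]
}

# Example usage:
#   ssh_protocol_ok /etc/ssh/sshd_config && echo "SSH v2 only"
```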

2) Desktop environments
    i) X-Windows
        X-Windows is not allowed in production, and xhost must not be used.
        X-Window traffic must be tunneled through SSH. For this, the line "X11Forwarding yes" must be active (not commented out) in the file /etc/ssh/sshd_config
    ii) Desktop Environment
        Desktop environments are not allowed. Disable the dtlogin service
        #svcadm disable cde-login
    iii) Remove the X window dump utilities
        #rm /usr/openwin/bin/xwd
        #rm /usr/openwin/bin/xwud

3) Password Security
    i) Local Unix Password Baseline
        Minimum number of alphabetic characters is 1
            /etc/default/passwd contains the setting MINALPHA=1
        Minimum number of special characters is 1
            /etc/default/passwd contains the setting MINSPECIAL=1
        Maximum number of repeated characters is 1
            /etc/default/passwd contains the setting MAXREPEATS=1
    ii) Unix Password History
        Set the prior password history to 10
            /etc/default/passwd contains the setting HISTORY=10
    iii) Unix Account unsuccessful login retries
        /etc/default/login contains "RETRIES=3"
        /etc/user_attr contains "lock_after_retries=no" for root
            root::::auths=solaris.*,solaris.grant;profiles=Web Console Management,All;lock_after_retries=no;
    iv) Account Password life
        Password is valid for 30 days, can be changed again after 7 days, and the user is warned 7 days before expiry.
        #passwd -x 30 -n 7 -w 7 <username>
    v) Session Inactive
        Set the login session timeout in /etc/default/login. The value is in seconds: 300 secs is 5 mins, so a 15 min timeout would be TIMEOUT=900.
        #cat /etc/default/login
        :::
        TIMEOUT=300
        :::
    vi) In addition, add the following lines in /etc/default/passwd
        MAXWEEKS=4
        MINWEEKS=1
        PASSLENGTH=8

        MAXWEEKS - Maximum time period, in weeks, that a password is valid.
        MINWEEKS - Minimum time period, in weeks, before a password can be changed.
        PASSLENGTH - Minimum length of a password, in characters.
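
The KEY=VALUE settings in this section can be audited with a small script. This is a minimal sketch, assuming a POSIX shell (on Solaris 10 that means /usr/xpg4/bin/sh or ksh rather than the legacy /bin/sh) and assuming each setting appears at the start of a line with no spaces around "=". The function name audit_defaults is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: audit_defaults FILE KEY=VALUE...
# Reports OK or FAIL for each expected setting and returns non-zero
# if any setting is missing or has a different value.
audit_defaults() {
    file=$1
    shift
    status=0
    for pair in "$@"; do
        key=${pair%%=*}
        expected=${pair#*=}
        # Take the last active KEY= line; commented lines do not match.
        actual=$(sed -n "s/^${key}=//p" "$file" | sed -n '$p')
        if [ "$actual" = "$expected" ]; then
            echo "OK   $key=$actual"
        else
            echo "FAIL $key: expected $expected, found ${actual:-unset}"
            status=1
        fi
    done
    return $status
}

# Example usage:
#   audit_defaults /etc/default/passwd MINALPHA=1 MINSPECIAL=1 HISTORY=10 \
#       MAXWEEKS=4 MINWEEKS=1 PASSLENGTH=8
```

The same function works for any of the /etc/default files, since they all share the KEY=VALUE layout.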

4) Logging and Enabling User authentication auditing

All successful and failed logins are logged.
Add "auth.info /var/log/authlog" to /etc/syslog.conf to capture syslog events sent to LOG_AUTH, separating the two fields with a tab. This log contains information on successful and failed login and su (switch user) attempts.

#touch /var/log/authlog
#chown root:sys /var/log/authlog
#chmod 600 /var/log/authlog

#vi /etc/syslog.conf
auth.info /var/log/authlog

Logging only Failed Logins

#cat /etc/default/login

SYSLOG=YES

SYSLOG_FAILED_LOGINS=3

#touch /var/adm/loginlog
#chmod 600 /var/adm/loginlog
#chown root:sys /var/adm/loginlog

Logging only Successful logins

#touch /var/log/logins
#chgrp sys /var/log/logins
#chmod 600 /var/log/logins

#cat /etc/syslog.conf

local0.info /var/log/logins

Add the following entry to /etc/profile and /etc/.login:
logger -p local0.info "User $LOGNAME has logged in"

After editing the /etc/syslog.conf file, restart the service

#svcadm disable system-log

#svcadm enable system-log
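
A common reason for a new syslog.conf rule not taking effect is that the selector and the action are separated by spaces instead of a tab. The following is a minimal sketch of a check for that, assuming every rule line should contain at least one tab. The function name find_untabbed_rules is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the non-comment, non-blank lines of a
# syslog.conf style file that contain no tab character at all.
find_untabbed_rules() {
    tab=$(printf '\t')
    grep -v '^#' "$1" | grep -v '^[[:space:]]*$' | grep -v "$tab"
}

# Example usage (prints nothing when every rule line contains a tab):
#   find_untabbed_rules /etc/syslog.conf
```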

SU events logging

#cat /etc/default/su
SYSLOG=YES

SULOG=/var/adm/sulog

Cron commands should be logged

#cat /etc/default/cron
CRONLOG=YES

5) Folder and File permissions
    Set the permissions on important system folders and files
    #chmod 755 /etc /var /var/spool
    #chmod 700 /var/cron
    #chmod 750 /etc/security
    #chmod 600 /var/adm/messages /var/log/syslog /var/adm/loginlog
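
After the chmod commands have been run, the resulting modes can be verified. This is a minimal sketch that compares the symbolic mode printed by ls -ld with an expected string (755 on a directory is drwxr-xr-x, 700 is drwx------, 750 is drwxr-x---, and 600 on a file is -rw-------). The function name check_mode is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: check_mode PATH EXPECTED_MODE_STRING
# Compares the first 10 characters of `ls -ld` output with the expected
# symbolic mode, for example drwxr-x--- for a directory with mode 750.
check_mode() {
    actual=$(ls -ld "$1" | cut -c1-10)
    if [ "$actual" = "$2" ]; then
        echo "OK   $1 $actual"
    else
        echo "FAIL $1: expected $2, found $actual"
        return 1
    fi
}

# Example usage:
#   check_mode /etc/security drwxr-x---
#   check_mode /var/adm/loginlog -rw-------
```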