Friday, 7 December 2012

Solaris Live Upgrade Procedure for Patching

      Normally, applying a CPU patch cluster to a Solaris OS takes about an hour, which lengthens the maintenance downtime window. To overcome this, Solaris provides a feature called "Live Upgrade".
         The main advantages of Live Upgrade are that it minimizes downtime and lets the system admin revert to the original OS if patching fails.
       In general, a Live Upgrade consists of the following steps
  • Create a new boot environment if you haven't already.
  • Patch the new boot environment.
  • Boot from the new boot environment.
  • Check your results for the changes and see if they are acceptable 

Following are the steps to perform Solaris Live Upgrade for Patching

Step 1: Verify Live Upgrade packages are installed
           #pkgchk -v SUNWlucfg SUNWlur SUNWluu

Step 2: Name the current boot environment and create a new boot environment within the same root pool (rpool)
                #lucreate -c BE_APP01 -n new_BE_APP01

Step 3: Verify step 2
           #zfs list 

Step 4: Apply CPU onto the new boot environment 
           #cd 10_Recommend
           #./installpatchset -B new_BE_APP01 --s10patchset

Step 5: Activate the new boot environment
           #luactivate -s new_BE_APP01

Step 6: Reboot the server to boot to the new boot environment
           #shutdown -i 6 -g 0 -y

   Note: Do NOT use any other command to reboot the server; you must use the command above (luactivate itself warns that only init or shutdown will boot the new environment). Be patient and wait at least 5-10 minutes. If the server still does not reboot, type the command "reboot".

Step 7: Verify the status
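
Step 7 does not name a command. Live Upgrade's lustatus lists every boot environment and marks which one is active, so one way to verify is to check that the new environment is the active one. The sketch below assumes the usual lustatus column layout (name first, "Active Now" third) and a POSIX-style shell; the helper name active_be is hypothetical.

```shell
# Hypothetical helper: print the name of the boot environment that is
# active right now, reading lustatus-style output on stdin.
# Assumption: column 1 is the BE name and column 3 is "Active Now".
# Header lines never have "yes" in column 3, so they are skipped.
active_be() {
    awk '$3 == "yes" { print $1 }'
}

# Usage on the patched server (not run here):
#   lustatus | active_be
```

After a successful Live Upgrade the helper should print new_BE_APP01.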

Post Live Upgrade Procedures
  • If there are no problems, you can delete the former boot environment to save disk space by issuing the command
          #ludelete BE_APP01  
  • For future patching, you only need to create a new boot environment within the same root pool (rpool) by issuing the command
         #lucreate -n BE_APP01_yyyymmdd  

Netbackup Important Terminologies -- Part 2

Major Daemons Running in Master/Media/Client

Master - 4 Daemons
           i) bprd       --  request daemon started during system boot
          ii) bpsched  -- schedule daemon
         iii) bpdbm    -- database manager started by bpsched
         iv) bpjobd    -- Job monitor daemon started by bpdbm
               bprd --> bpsched --> bpdbm --> bpjobd

 Media - 9 Daemons 
          i) bpcd   -- Backup Client (Since media itself is a client)
         ii) bpbrm -- Backup and restore daemon
        iii) bpdm  -- Backup disk manager
        iv) bptm   -- Backup tape manager
         v) ltid     -- Media manager daemon started during the system boot
        vi) avrd    -- Automatic Volume Recognition daemon
       vii) vmd    -- Volume Manager daemon 
      viii) tldd     -- robotic tape library daemon
        ix) tldcd   -- robotic tape library controller daemon

Client - 2 daemons  
        i) bpcd -- NB client daemon
       ii) bpbkar -- Daemon used to take backup images.

How does the Netbackup process work?
        There are 7 steps involved in a NB backup
     In Master Server 

             bprd --> bpsched --> bpdbm --> bpjobd
        Step 1:
             *  When the daemons start, "bprd" on the master server starts "bpsched"
       Step 2:
             *  "bpsched" calls "bpdbm" to check the policies for automatic backup schedules
             *  "bpsched" starts a child process to handle the backup
             *  "bpdbm" calls "bpjobd" to communicate with the job catalog
         Step 3:
             *  The "bpsched" child contacts the media server to start "bpcd"
     In Media Server
             bpcd --> bpbrm --> bptm 
      Step 4: 
           * “bpcd” invokes “bpbrm”
           * “bpbrm” starts “bptm”, which requests a tape mount and spawns a child process to communicate with the client.
           * vmd (Volume Manager Daemon) manages the volume catalog and handles media requests throughout the course of the backup job.
           * “bpbrm” contacts the client to start “bpcd” on the client.
    In Client 
        Step 5:  "bpcd"  in the client starts the "bpbkar"
        Step 6:  "bpbkar" reads data from client
             * Communicates directly with "bpbrm" to send catalog metadata, which is ultimately written to the image catalog.
             * Sends the backup data stream to the "child bptm" process
       Step 7: 
             * The child bptm process passes the data to the "parent bptm" process, which writes the data directly to the destination media.

What is Volume Group and Volume Pool?

Volume Pool
  • Is nothing but a “grouping of tapes”.
  • Volume, media, cartridge and tape all mean the same thing here.
  • It is a logical set of tape media/volumes.
  • If we want to use a newly introduced tape and write data to it, it must belong to one of the “volume pools”.
    We can view the volume pools in the following ways.
         In GUI,  Media and Device Management --> Media --> Volume Pools
         In CLI, #vmpool -list_all -bx
  root@MasterServer# vmpool -list_all -bx
  pool              index   max partially full   description
  None              0       0                    the None pool
  NetBackup         1       0                    the NetBackup pool
  DataStore         2       0                    the DataStore pool
  CatalogBackup     3       0                    NetBackup Catalog Backup pool
  SCRATCH_pool      4       0                    for scratch media
  APPSERVER_LOGS    5       1                    APP server logs
  DATABASE_LOGS     6       3                    Database full
  WEBSERVER_FULL    7       1                    Web server full
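
Because a tape can only be written once it sits in a volume pool, it can be handy to confirm that a pool exists before assigning media to it. The sketch below is only an illustration: it assumes the column layout shown in the listing above (pool name in the first column, one header line) and a POSIX awk (nawk or /usr/xpg4/bin/awk on Solaris). The helper name pool_exists is hypothetical.

```shell
# Hypothetical helper: succeed if the named pool appears in
# "vmpool -list_all -bx" style output read from stdin.
# Assumption: line 1 is the header and column 1 is the pool name.
pool_exists() {
    awk -v want="$1" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}

# Usage on the master server (not run here):
#   vmpool -list_all -bx | pool_exists SCRATCH_pool && echo "pool is defined"
```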

Volume Group
  • Is the physical location of a media.
  • Manages a group of tapes for admin actions, such as moving tapes from slots to drives and vice versa.
  • All volumes in the VG should be the same media type. (HCART,DLT)
  • All volumes in a robotic library must belong to a volume group. You cannot add volumes to a robotic library without specifying a group.
  • We can view the Volume groups in 2 ways.
      i) In GUI, Media and Device management --> Media --> Volume Groups
      ii)In CLI, #vmquery -a |grep "media ID:"
   root@SS73VPBK01 # vmquery -a |grep "media ID:"
   media ID:              CLNU01
   media ID:              PL5001
   media ID:              PL5002
   media ID:              PL5003
   media ID:              PL5004
What are the Default Volume Pools?
     There are 4 volume pools created by default
       i) NetBackup --> new tapes go to this pool after inventory
      ii) None (for cleaning tapes)
      iii) DataStore
      iv) CatalogBackup

What is Scratch Pool ?
     The scratch pool is a manually created volume pool. It works as follows
     i) Expired tapes (volumes) are automatically moved to the scratch pool by Netbackup.
    ii) If another pool used for backups has no tapes available, it takes empty tapes from the scratch pool.

What is Multistreaming and Multiplexing?

Multistreaming
        i) Sends data from a single client to multiple drives. Normally used in high-bandwidth environments.
      ii) Backups can be divided into streams, with different jobs going to different tapes.
     iii) Takes place at the software level.

Multiplexing
      i) Sends different jobs/data to a single tape. Configured at the schedule level.
     ii) It is very fast, especially for small environments.
    iii) Takes place at the hardware level.

Thursday, 6 December 2012

Solaris Hardening Procedure -- Part 1

       This article describes the hardening procedure for Solaris OS in detail. Hardening takes place in different segments of the OS, and because the procedures are lengthy, the article is split into 3 parts.
1) Services
    a) Disabling the Restricted Services
            i) Restricted services 
                  Stop the restricted services, which pose a risk to servers. The following are restricted services.
        yppasswdd, ypserv, ypxfrd
        Services (i.e. shell, login, klogin, exec, etc.) that listen for r-commands (rlogin, rsh etc).
        ToolTalk (ttdbserverd)
        Calendar Manager (cmsd)
        statd (Unless required by NFS. See Use of NFS section for restrictions)
        sadmind (solstice admin daemon)
        automount (Solaris)

           ii) SSH client and server
              Only Secure Shell protocol version 2 is allowed; SSH protocol v1 must be disabled. This is set in the file /etc/ssh/sshd_config
              #Protocol 1
              Protocol 2

          iii) Disable NIS services
             #svcadm disable svc:/network/nis/server:default   
             #svcadm disable svc:/network/nis/client:default 

          iv) Disable Sendmail
             #svcadm disable svc:/network/smtp:sendmail

2) Desktop environments
        i) X-Windows 
              X-Windows is not allowed in production, and xhost must not be used.
              X-window traffic must be tunneled through SSH. For this, make sure the line "X11Forwarding yes" is present and not commented out in the file /etc/ssh/sshd_config 
       ii) Desktop Environment
              Desktop environments (DE) are not allowed. Disable the dtlogin service
              #svcadm disable cde-login
       iii) Remove the X window dump utilities
           #rm /usr/openwin/bin/xwd
           #rm /usr/openwin/bin/xwud

3) Password Security  
       i) Local Unix Password Baseline 
             Minimum number of alphabetic characters is 1
               /etc/default/passwd contains the setting "MINALPHA=1"
            Minimum number of special characters is 1
              /etc/default/passwd contains the setting "MINSPECIAL=1"
            Maximum number of repeated characters is 1
              /etc/default/passwd contains the setting "MAXREPEATS=1"

        ii) Unix Password History 
             Set prior password history to 10
                /etc/default/passwd contains "HISTORY=10"
       iii) Unix Account unsuccessful login retries
              /etc/default/login contains "RETRIES=3"
              /etc/user_attr    contains "lock_after_retries=no" for root
             root::::auths=solaris.*,solaris.grant;profiles=Web Console Management,All;lock_after_retries=no;

       iv) Account Password life
             Password is valid for 30 days.
            #passwd -x 30 -n 7 -w 7 <username>

       v) Session Inactive  
            Enable the login session timeout. The value is in seconds: 300 is 5 minutes, so use 900 for 15 minutes.
            #cat /etc/default/login 
            TIMEOUT=300
       vi) In addition, set the following parameters in /etc/default/passwd 
            MAXWEEKS - Maximum time period that a password is valid.
            MINWEEKS - Minimum time period before a password can be changed.
            PASSLENGTH - Minimum length of a password, in characters.
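
The password baseline above comes down to a handful of KEY=VALUE lines in files under /etc/default, so it is easy to audit with a small script. The following is a minimal sketch, not a complete audit tool: the helper names has_line and report_missing are hypothetical, and it only recognises lines that are uncommented and written exactly as KEY=VALUE.

```shell
# Hypothetical helper: succeed if FILE contains exactly the line given
# (an uncommented KEY=VALUE entry such as MINALPHA=1).
has_line() {
    # $1 = file, $2 = expected line
    grep "^$2\$" "$1" > /dev/null 2>&1
}

# Hypothetical helper: print every expected setting that FILE lacks.
report_missing() {
    # $1 = file; remaining arguments = expected KEY=VALUE lines
    file=$1
    shift
    for pair in "$@"; do
        has_line "$file" "$pair" || echo "missing: $pair"
    done
}

# Usage on the hardened server (not run here):
#   report_missing /etc/default/passwd MINALPHA=1 MINSPECIAL=1 HISTORY=10
```

No output means every listed setting is present.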

4) Logging and Enabling User authentication auditing
         All successful and failed logins are logged.
           Add "/var/log/authlog" to /etc/syslog.conf to capture syslog events sent to LOG_AUTH. This log contains information on successful and failed login and su (switch user) attempts.

             #touch /var/log/authlog
             #chown root:sys /var/log/authlog
             #chmod 600 /var/log/authlog

             #vi /etc/syslog.conf
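
The line to add in /etc/syslog.conf is not shown above. A syslog.conf entry is a selector and a log file separated by a tab (Solaris syslogd expects tabs, not spaces). The sketch below appends such an entry only if the log file is not already listed. The helper name add_syslog_entry is hypothetical, and the selector auth.info in the usage line is an assumption, since the text only says LOG_AUTH and does not give a level. It assumes a POSIX grep.

```shell
# Hypothetical helper: append "SELECTOR<TAB>LOGFILE" to a syslog.conf-style
# file unless a line ending in that log file is already present.
add_syslog_entry() {
    # $1 = config file, $2 = selector (assumed, e.g. auth.info), $3 = log file
    if grep "[[:space:]]$3\$" "$1" > /dev/null 2>&1; then
        return 0
    fi
    printf '%s\t%s\n' "$2" "$3" >> "$1"
}

# Usage (try it on a copy of the file first; not run here):
#   add_syslog_entry /etc/syslog.conf auth.info /var/log/authlog
```

Running it twice leaves a single entry, so it is safe to repeat.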

          Logging only Failed Logins
                #cat /etc/default/login 
             #touch /var/adm/loginlog
             #chmod 600 /var/adm/loginlog
             #chown root:sys /var/adm/loginlog
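          Logging only Successful logins
              #touch /var/log/logins
              #chgrp sys /var/log/logins
              #chmod 600 /var/log/logins
              #cat /etc/syslog.conf
         Add the following entry to /etc/profile and /etc/.login, where <facility.level> is the selector that /etc/syslog.conf routes to /var/log/logins:
         logger -p <facility.level> "User $LOGNAME has logged in"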

         After editing the /etc/syslog.conf file restart the service
             #svcadm disable system-log
             #svcadm  enable system-log

         SU events logging
              #cat /etc/default/su

         Cron commands should be logged
               #cat /etc/default/cron

5) Folder and File permissions 
        Set the permissions on important system folders and files 
              #chmod 755 /etc /var /var/spool
              #chmod 700 /var/cron
              #chmod 750 /etc/security
              #chmod 600 /var/adm/messages /var/log/syslog /var/adm/loginlog
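
After running the chmod commands above, the result can be checked by comparing the mode string that ls -ld reports with the expected one (755 on a directory is drwxr-xr-x, 700 is drwx------, 750 is drwxr-x---, and 600 on a file is -rw-------). This is a minimal sketch assuming a POSIX shell; the helper names mode_of and has_mode are hypothetical.

```shell
# Hypothetical helper: print the 10-character type and mode string of a
# path, for example "drwx------", as reported by ls -ld.
mode_of() {
    ls -ld "$1" | cut -c1-10
}

# Hypothetical helper: succeed if the path has exactly the expected string.
has_mode() {
    # $1 = path, $2 = expected string such as drwxr-xr-x
    [ "$(mode_of "$1")" = "$2" ]
}

# Usage on the hardened server (not run here):
#   has_mode /var/cron drwx------     || echo "/var/cron is not 700"
#   has_mode /etc/security drwxr-x--- || echo "/etc/security is not 750"
```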