Recent Posts

1
Nutanix / NOS Upgrade: did not grant shutdown token
« Last post by PureSystemsTech on February 16, 2018, 09:55:01 AM »
All,

This post is in reference to an error you may see in the log file "Genesis.out" under /home/nutanix/data/logs when an upgrade hangs and does not continue.

This happened to me on my CE cluster, which is a single-node cluster, where a shutdown token was not being granted to allow the upgrade to finish.

When you run upgrade_status on the CVM CLI you will see the following for a while, until you realize "hey, this thing is hung".

Quote
nutanix@NTNX-d02003b1-A-CVM:192.168.1.41:~$ upgrade_status
2018-02-16 06:42:47 INFO zookeeper_session.py:110 upgrade_status is attempting to connect to Zookeeper
2018-02-16 06:42:47 INFO upgrade_status:38 Target release version: el7.3-release-ce-2018.01.31-stable-c3b9964290bf2f28799481fed5cf32f92ab3dadc
2018-02-16 06:42:47 INFO upgrade_status:43 Cluster upgrade method is set to: automatic rolling upgrade
2018-02-16 06:42:47 INFO upgrade_status:96 SVM 192.168.1.41 still needs to be upgraded. Installed release version: el7.3-release-ce-2017.11.30-stable-ab2ac46f51d4745d43126c9ad1871b7314400bab, node is currently upgrading

If you see this message in Genesis.out

Quote
Master 192.168.1.41 did not grant shutdown token to my ip 192.168.1.41, trying again in 30 seconds

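Try running the following command to grant a token. Note that the requester_ip in this example (192.168.1.50) differs from the CVM address shown everywhere else in this post (192.168.1.41), so you will presumably want to put your own CVM's IP there.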

Quote
echo -n '{"request_reason": "nos_upgrade", "request_time": 1496860678.099324, "requester_ip": "192.168.1.50"}' | zkwrite /appliance/logical/genesis/node_shutdown_token
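
If you would rather not hand-edit the JSON, here is a minimal Python sketch that assembles the same one-liner with a fresh timestamp and your own IP. The field names and the zknode path are copied from the command above; the helper name `build_token_command` is my own, and whether Genesis actually checks `request_time` is an assumption I have not verified. It only prints the command; it does not touch Zookeeper.

```python
import json
import shlex
import time

# zknode path copied from the command in this post
ZK_NODE = "/appliance/logical/genesis/node_shutdown_token"


def build_token_command(requester_ip, now=None):
    """Return the 'echo -n ... | zkwrite ...' one-liner as a string.

    requester_ip: the CVM IP that should receive the token.
    now: optional epoch timestamp; defaults to the current time.
    """
    payload = {
        "request_reason": "nos_upgrade",
        "request_time": time.time() if now is None else now,
        "requester_ip": requester_ip,
    }
    # shlex.quote wraps the JSON in single quotes so the shell passes it intact
    return "echo -n {} | zkwrite {}".format(shlex.quote(json.dumps(payload)), ZK_NODE)


if __name__ == "__main__":
    # Hypothetical usage: print the command for the CVM used in this post
    print(build_token_command("192.168.1.41"))
```

Review the printed line before pasting it into the CVM shell.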

Then tail the Genesis.out logs for the following messages.

Quote
Failed to read zknode /appliance/logical/genesis/node_shutdown_priority_list with error 'no node'
2018-02-16 06:45:15 INFO cluster_manager.py:4224 Successfully granted token to 192.168.1.41 reason nos_upgrade
2018-02-16 06:45:15 INFO node_manager.py:2266 Finishing upgrade to version el7.3-release-ce-2018.01.31-stable-c3b9964290bf2f28799481fed5cf32f92ab3dadc, you can view progress at /home/nutanix/data/logs/finish.out
2018-02-16 06:45:17 INFO ha_service.py:959 Checking if any nodes are shutting down due to upgrade
2018-02-16 06:45:17 INFO ha_service.py:977 Node 192.168.1.41 is going down
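
Instead of eyeballing the tail output, a small Python sketch like the one below can tell you whether the last token-related line in the log was a refusal or a grant. The two patterns are taken from the messages quoted in this post; other NOS versions may word them differently, so treat the regexes as assumptions. The function name `token_state` is my own.

```python
import re

# Patterns based on the log lines quoted in this post
DENIED = re.compile(r"did not grant shutdown token to my ip (\S+?),")
GRANTED = re.compile(r"Successfully granted token to (\S+) reason (\S+)")


def token_state(lines):
    """Return (state, ip) for the last token-related line seen.

    state is 'granted', 'denied', or 'unknown' when no line matched.
    """
    state = ("unknown", None)
    for line in lines:
        match = GRANTED.search(line)
        if match:
            state = ("granted", match.group(1))
            continue
        match = DENIED.search(line)
        if match:
            state = ("denied", match.group(1))
    return state
```

Any iterable of lines works, so you can pass an open file handle for the Genesis.out log directly.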

Now when you run upgrade_status on the CVM cli you should see that your SVM (CVM) is up to date.

Quote
nutanix@NTNX-d02003b1-A-CVM:192.168.1.41:~$ upgrade_status
2018-02-16 14:52:44 INFO zookeeper_session.py:110 upgrade_status is attempting to connect to Zookeeper
2018-02-16 14:52:44 INFO upgrade_status:38 Target release version: el7.3-release-ce-2018.01.31-stable-c3b9964290bf2f28799481fed5cf32f92ab3dadc
2018-02-16 14:52:44 INFO upgrade_status:43 Cluster upgrade method is set to: automatic rolling upgrade
2018-02-16 14:52:44 INFO upgrade_status:96 SVM 192.168.1.41 is up to date
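
On a cluster with more than one node, the upgrade_status output gets long. Here is a rough Python sketch that sorts the SVMs into "pending" and "up to date" based on the two message formats shown in this post. The function name `summarize` is my own, and the patterns are assumptions drawn only from the output above.

```python
import re

# Message formats as they appear in the upgrade_status output in this post
NEEDS_UPGRADE = re.compile(r"SVM (\S+) still needs to be upgraded")
UP_TO_DATE = re.compile(r"SVM (\S+) is up to date")


def summarize(output):
    """Split upgrade_status text into pending and up-to-date SVM IPs."""
    pending, done = [], []
    for line in output.splitlines():
        match = NEEDS_UPGRADE.search(line)
        if match:
            pending.append(match.group(1))
            continue
        match = UP_TO_DATE.search(line)
        if match:
            done.append(match.group(1))
    return {"pending": pending, "up_to_date": done}
```

Feed it the captured text of upgrade_status; an empty "pending" list suggests every SVM reported itself as up to date.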

I hope this helps! :)
2
Hi Aymen,

No, there would be no production impact. You would still be able to access your nodes' IMMs through the CMM as long as the CMM can still see them. All internal management communication goes through the CMM, and that includes the switches as well as the IMMs. Unmanaging the chassis from the FSM only disables management activities from the FSM while the chassis is unmanaged.

Thanks!
3
Hello team,

Thank you again.

This is a good idea for upgrading my Flex System ;D but is there any impact on my production chassis when I put the chassis in the unmanaged state? Sometimes when I perform a service-level reset of the CMM, my database stops working.

Best regards,
4
Hi Aymen,

My best suggestion is to upgrade all of your compute node firmware using Bootable Media Creator. Once upgraded, verify your CMM is at the latest level. Then 'unmanage' the chassis from the FSM and follow the command-line method for upgrading your firmware to the latest version. Once the upgrade is successful, try to remanage the chassis.

Please let us know your progress.

Thanks!
5
Hello team,
Thank you for your answer.
Actually, I can see the compute nodes from the CMM.
Since I upgraded the CMM to version and the compute node (Node 10) to version, I can't log into the IMM of the other compute nodes (maybe a firmware compatibility problem between the CMM and the IMM).

Now my chassis is managed but there is no access, and my objective is to upgrade my platform.

Best regards
6
Hi Aymen,

Are you able to see the compute nodes from the CMM? If not, have you upgraded the compute node firmware?

Thanks,
7
Hello,

Thank you for your answer.
Actually, I have a Flex System with 1 FSM, 2 CMMs, 8 compute nodes and two switches.
Version FSM: 1.3.1
Version CMM: 2.5.1 (2PET12D)
Version Compute node: 1.2.1
My FSM is connected to the internet.
I tried to upgrade the FSM, but without success.
So I upgraded the CMM from 2.5.1 to 2.5.3u.

Situation now:
All my compute nodes are running correctly, but in the FSM GUI the communication between the FSM and the chassis is offline, so I can't do anything (discovery, upgrade, ...).
What's more, my TSM server is running correctly, but the FSM shows its bay as empty.
After I upgraded the CMM, I lost access to the compute nodes' IMMs.
Hint: maybe I misunderstood this prerequisite.
Level of Recommendations and Prerequisites for the update: (https://delivery04.dhe.ibm.com/sar/CMA/XSA/05ex8/0/ibm_fw_cmm_2pet12u2.5.3u_anyos_noarch.txt)
- If the CMM is being managed by an IBM Flex System Manager (FSM), you must ensure the FSM is at the 1.2.1 level. If not, then you must update the FSM first before updating the CMM and other chassis components. See Flex Update Best Practices for details on how to update the FSM and all chassis components.
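
One way to read that prerequisite, assuming "at the 1.2.1 level" means 1.2.1 or later (the readme does not say so explicitly), is as a simple dotted-version comparison. The Python sketch below shows that reading. The function names are my own, and it only handles purely numeric versions, so a string such as "2.5.3u" would need extra handling.

```python
def parse_version(text):
    """Turn a purely numeric dotted version like '1.3.1' into (1, 3, 1)."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, required):
    """True if the installed version is at or above the required one."""
    return parse_version(installed) >= parse_version(required)


if __name__ == "__main__":
    # Versions from this post: FSM 1.3.1 installed, readme asks for 1.2.1
    print(meets_minimum("1.3.1", "1.2.1"))
```

Under that assumption, an FSM at 1.3.1 would satisfy the stated level, which would point to the problem being somewhere else.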

Best Regards
8
Hi Aymen,

Good afternoon, and thank you for your post. I would advise first making sure you can upgrade the FSM. Try removing all updates from the FSM database by running 'smcli cleanupd -mva' and then reboot. Also make sure the FSM version you are trying to upgrade to is on a compatible upgrade path from your current version. If the problem persists, please reply with a screenshot of the error.


Thanks!
9
Hello team,
Recently I tried to upgrade my Flex System (FSM, 2 CMMs, 8 compute nodes, 2 switches). I followed the IBM Flex System and IBM PureFlex Firmware Updates Best Practices document. But every time I tried to install the FSM update (CLI or GUI), a problem appeared: "OS not discovered or user is locked".
I checked the user, it is unlocked, and I discovered the whole system again. The problem persists.
So I gave up on the FSM upgrade and tried to upgrade the CMM from 2.5.1 to 2.5.3u, which succeeded.
I upgraded one compute node (one of the 8 nodes), which also succeeded.
Yesterday I checked the FSM GUI and saw that the chassis is online but the communication is offline. I can't upgrade the other compute nodes, and I also see that one slot is shown as empty although there is a system running in that slot.

Summary: the whole system is running, but in the FSM everything is down or communication is offline.

Can anyone help me?

Thanks 
10
Flex System Manager (FSM) / Re: SMIA Tool Locked Out!
« Last post by Whittenberg on July 05, 2017, 02:08:31 AM »
After doing some searching we found that the default administrator password for the smia tool is
Administrator/password.

Thanks man, that got me in.