Sunday, December 15, 2013

Troubleshooting WCCP and related data collection commands


On Router:



1. show version
2. show ip wccp
3. show ip wccp interface detail
4. show ip wccp <service> service
5. show ip wccp <service> detail
6. show ip wccp <service> internal
7. show ip wccp <61 / 62> hash



Repeat commands 4, 5 and 6 for each active service (as indicated by “show ip wccp”).

1. show process cpu history

2. show process cpu | exclude 0.00



Repeat this command 6-8 times to identify any process that is consistently consuming CPU.

1. debug ip wccp events

2. debug ip wccp packets




The debugging output can be fairly verbose, so it is best to turn off console logging and log the debugs to a buffer. When done, capture the results with “show log”. Make sure you capture everything: some terminal emulators truncate text beyond 80 characters.



Information specific to Catalyst 6500

1. show tcam counts
2. show mls stat
3. show mls netflow table detail
4. show mls netflow ip count
5. show mls netflow ip sw-installed count
6. show mls netflow ip sw-installed detail
7. show fm interface <interface name>



Mask Assignment


1. show ip wccp <service> mask
2. show ip wccp <service> merge
3. show tcam interface <interface name> acl <in | out> ip
4. show tcam interface <interface name> acl <in | out> ip detail

Troubleshooting High CPU on 6500 by using Netdr capture and other tools.



To troubleshoot high CPU on a Cisco 6500, follow the procedure below.


When an issue such as high CPU hits, there is rarely time to read through all the documentation; a few commands are enough to find the cause and fix it.



1) Check the CPU utilization with the command below and determine whether it is caused by processes or by interrupts.

show process cpu | e 0.00

CPU utilization for five seconds: 99/85%; one minute: Z%; five minutes: W% 
  PID  Runtime(ms)  Invoked  uSecs    5Sec   1Min   5Min TTY Process 


2) If the high CPU is due to interrupts, as shown above (85%, the second number in the five-second figure), the following tools are available:

a) Netdr capture b) inband RP SPAN c) buffer capture.

Netdr capture is the best way to see what traffic is hitting the RP on the 6500.
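
The process/interrupt split in the header line can be read off mechanically. The following is a minimal Python sketch, assuming the header line has been copied off the switch as text; the function name split_cpu_utilization is illustrative and not part of any Cisco tool. The first number is total CPU, the second is the share spent at interrupt level, and the difference is process-level load.

```python
import re

def split_cpu_utilization(line):
    """Return (total, interrupt, process) percentages from the five-second figure."""
    # Accepts both "99/85%" (as written in this post) and "99%/85%".
    match = re.search(r"five seconds:\s*(\d+)%?/(\d+)%", line)
    if match is None:
        raise ValueError("no five-second CPU figure found")
    total = int(match.group(1))
    interrupt = int(match.group(2))
    return total, interrupt, total - interrupt

header = "CPU utilization for five seconds: 99/85%; one minute: Z%; five minutes: W%"
total, interrupt, process = split_cpu_utilization(header)
print(f"total={total}% interrupt={interrupt}% process={process}%")
```

With the example header, most of the load is at interrupt level, which is the case the Netdr capture is meant for.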

3) Enable the netdr capture with the command below.

debug netdr capture rx   

Note: In my experience this command has no noticeable impact on CPU utilization, even when the CPU is at 99%. I have used it with all my customers without any issues.

4) Use the command below to view the captured packets that went to the CPU.

Show netdr captured-packets

Below is an example of the output collected with show netdr captured-packets.

interface Vl80, routine mistral_process_rx_packet_inlin, timestamp 10:13:22.291
dbus info: src_vlan 0x50(80), src_indx 0xB43(2883), len 0x40(64)
  bpdu 0, index_dir 0, flood 0, dont_lrn 0, dest_indx 0x380(896)
  68820400 00500000 0B430100 40080000 00010408 0E000008 00000010 03809CC9
mistral hdr: req_token 0x0(0), src_index 0xB43(2883), rx_offset 0x76(118)
  requeue 0, obl_pkt 0, vlan 0x50(80)
destmac 00.08.E3.FF.FD.90, srcmac 8C.73.6E.C0.7D.80, protocol 0800
protocol ip: version 0x04, hlen 0x05, tos 0x00, totlen 36, identifier 21919
  df 0, mf 0, fo 0, ttl 1, src 134.243.80.203, dst 134.243.80.254
    icmp type 8, code 0

5) The output above shows the full header information for each packet: the ingress interface and source VLAN, the source and destination MAC addresses, and the source and destination IP addresses.

6) Use regular expressions with show netdr captured-packets to filter the output and find out what percentage of the packets hitting the CPU are of the same kind.

Show netdr captured-packets | i interface

Show netdr captured-packets | i srcmac

Show netdr captured-packets | i destmac

Show netdr captured-packets | i src-info 

7) Once you know which interface, source or source VLAN is sending most of the traffic, you can either shut down the interface or use an access list to block the traffic.
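
If the capture has been saved to a text file, the same counting can be done offline. The following is a rough Python sketch, assuming the output follows the layout of the example above; top_talkers and share are hypothetical helper names, and the third packet in the sample (Vl90, 00.11.22.33.44.55, 192.0.2.x) is made up for illustration.

```python
import re
from collections import Counter

def top_talkers(capture_text):
    """Count captured packets per ingress interface, source MAC and source IP."""
    return {
        "interface": Counter(re.findall(r"^interface ([^,\s]+),", capture_text, re.M)),
        "srcmac": Counter(re.findall(r"srcmac ([0-9A-Fa-f.]+)", capture_text)),
        "src_ip": Counter(re.findall(r"\bsrc (\d+\.\d+\.\d+\.\d+)", capture_text)),
    }

def share(counter):
    """Return (key, count, percent) tuples, most frequent first."""
    total = sum(counter.values())
    return [(key, n, 100.0 * n / total) for key, n in counter.most_common()]

# Only the lines the patterns need; the last packet is invented sample data.
SAMPLE = """\
interface Vl80, routine mistral_process_rx_packet_inlin, timestamp 10:13:22.291
destmac 00.08.E3.FF.FD.90, srcmac 8C.73.6E.C0.7D.80, protocol 0800
  df 0, mf 0, fo 0, ttl 1, src 134.243.80.203, dst 134.243.80.254
interface Vl80, routine mistral_process_rx_packet_inlin, timestamp 10:13:22.295
destmac 00.08.E3.FF.FD.90, srcmac 8C.73.6E.C0.7D.80, protocol 0800
  df 0, mf 0, fo 0, ttl 1, src 134.243.80.203, dst 134.243.80.254
interface Vl90, routine mistral_process_rx_packet_inlin, timestamp 10:13:22.301
destmac 00.08.E3.FF.FD.90, srcmac 00.11.22.33.44.55, protocol 0800
  df 0, mf 0, fo 0, ttl 64, src 192.0.2.10, dst 192.0.2.1
"""

counts = top_talkers(SAMPLE)
for field, counter in counts.items():
    for key, n, pct in share(counter):
        print(f"{field:9} {key:20} {n:3} {pct:5.1f}%")
```

The entry at the top of each list is the first candidate for a shutdown or an access list.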

8) If the traffic is a unicast flow to or from a particular IP address, it indicates that the traffic is being switched in software instead of in hardware.

9) Use the show mls cef commands to find out why traffic is not being switched in hardware.
  Below are some commands that can be useful for getting more information on MLS CEF hardware switching.


  show ip interface <gig> | i ip|CEF|UP
  Show fm summary
  show mls cef exception status 
  show mls cef maximum-routes
  show ip cef switching statistics 
  show ip cef switching statistics feature 
  show tcam interface <>


I hope the commands above give you some insight into troubleshooting high CPU.

Troubleshooting Cisco 3850/3650 password recovery procedure



1. Console into the switch

2. Power cycle the switch

3. Press and hold the Mode button until it glows amber

4. When in the boot loader mode ("Switch:" prompt):

++ initialize the flash system:
Switch: flash_init

++ load any helper files:

Switch: load_helper

++ display the content of the Flash location:

Switch: dir flash:

view the file system.

Directory of flash:
   13  drwx         192   Mar 01 1993 22:30:48 switch_image
   11  -rwx        5825   Mar 01 1993 22:31:59  config.text
   18  -rwx         720   Mar 01 1993 02:21:30  vlan.dat

16128000 bytes total (10003456 bytes free)

5. Create a new boot variable to bypass current STARTUP-CONFIG:
       Switch: SWITCH_IGNORE_STARTUP_CFG=1

6. Make sure the variable that disables password recovery is not set to 1:

Switch: SWITCH_DISABLE_PASSWORD_RECOVERY=0



7. Boot the switch using the image you want (e.g. flash:packages.conf or flash:switch_image)

Switch: boot flash:packages.conf     

or  

Switch:  boot flash:switch_image    <---- image you want to use

8. When in IOS XE EXEC mode, update and save the updated startup config


Switch> enable

Switch# copy startup-config running-config

Switch# config terminal

Switch(config)# enable secret <password>

Switch(config)# end

Switch# copy running-config startup-config



9. Reload the switch

Switch# reload

10. Break into boot loader one more time (step 3: Press and hold the Mode button until it glows amber)

11. When back in boot loader mode ("Switch:" prompt), remove the bypass startup config entry and reboot

12. Unset the variable:

Switch: unset SWITCH_IGNORE_STARTUP_CFG

13. Boot the switch image as done before

Switch: boot flash:packages.conf     

or
Switch:  boot flash:switch_image    <---- image you want to use

Troubleshooting packet flow on Cisco 6500 based on SUP720 by using ELAM


1. Enable service internal (this is a hidden command so it is not available in the parser)

   Switch(config)#service internal

2. Specify the asic to capture on:

   Switch#show platform capture elam asic superman slot 5  

3. Specify the trigger for the flow we want to capture:

   Switch#show platform capture elam trigger dbus ipv4 if ip_sa=x.x.x.x ip_da=x.x.x.x

   Note: use the OTHER trigger for ARP requests and ARP replies. ELAM will not trigger if the ARP entry is incomplete.

4. Start the capture

   Switch#show platform capture elam start


5. View the status, we are looking for "elam capture completed"

   Switch#show platform capture elam stat

6. After the packet has been caught, view the data

   Switch#show platform  capture elam data


  Decode the ELAM result

1) If the flood bit is set, add 0x8000 to the destination index (for example, 0x48 becomes 0x8048), then look up that index to see the destination ports.



   Switch#Remote login switch
   Switch-sp#test mcast ltl index info <>
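
One reading of the flood-bit rule above, which matches how Sup720 ELAM results are usually decoded, is that the flood bit adds 0x8000 to the destination index, so 0x48 becomes 0x8048. The Python sketch below only illustrates that arithmetic; the 0x8000 offset is an assumption about what "plus 8" means here, and ltl_index is a hypothetical helper name.

```python
# Assumed offset applied to the destination index when the flood bit is set.
FLOOD_OFFSET = 0x8000

def ltl_index(dest_index, flood):
    """Return the index to look up, given the ELAM dest index and flood bit."""
    return dest_index + FLOOD_OFFSET if flood else dest_index

# Hypothetical ELAM result: destination index 0x48 with the flood bit set.
print(hex(ltl_index(0x48, flood=True)))
print(hex(ltl_index(0x380, flood=False)))
```

The resulting value is what would be passed to the LTL lookup on the switch processor.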


2) ELAM can be run on egress as well. Use only IP addresses in the trigger, with no source/destination MAC.

3) Use OTHER for ARP ELAM.

Troubleshooting High CPU on 4500 switch

 

1) Check the CPU utilization to figure out whether it is due to a process or to interrupt traffic.

show proc cpu | e 0.00

or

show proc cpu sorted


2) If it is due to io-base, use the command below to check which process or component (for example "k5cpumain review") is causing the high CPU.

show proc cpu details pid <> sorted

3) Check which component is using more resources than expected, for example by comparing the TARGET and ACTUAL CPU columns.

show platform health

4) Check the status of the CPU queues and the packet drops in each queue, and use the average columns (5 sec / 5 min / 1 hour) to find out in which queue packets are being punted to the CPU. The CPU has 64 queues.

show platform cpu packet statistics

Look for the queue, and therefore the kind of packet, being punted to the CPU, such as "Adj same If", "SA miss" or "K5CPU main review".

Repeat the command several times and check whether the counters are incrementing.
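
Comparing two runs by eye gets tedious with many queues. The following is a small Python sketch, assuming the per-queue counters from two runs have already been copied into dictionaries by hand; incrementing_queues is a hypothetical helper and the counter values are made up.

```python
def incrementing_queues(before, after):
    """Return {queue: delta} for queues whose counter grew, largest first."""
    deltas = {}
    for queue, new_count in after.items():
        delta = new_count - before.get(queue, 0)
        if delta > 0:
            deltas[queue] = delta
    return dict(sorted(deltas.items(), key=lambda item: item[1], reverse=True))

# Made-up counters from two runs taken a few seconds apart.
first_run = {"Adj same If": 100, "SA miss": 50, "K5CPU main review": 10}
second_run = {"Adj same If": 100, "SA miss": 950, "K5CPU main review": 12}

for queue, delta in incrementing_queues(first_run, second_run).items():
    print(f"{queue}: +{delta}")
```

The queue with the largest delta is the one to match against the packet capture in the next step.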

5) There are several ways to see in detail which packets are going to the CPU. The tools below can be used to capture the packets.


Tool 1: Use "debug platform cpu packet receive buffer" and then "show platform cpu packets buffer". Look for the "Event id" that matches the queue where packets are being punted to the CPU. For example, if show platform cpu packet statistics shows packets being punted in the Adj same If queue, look for that queue under EVENT ID in show platform cpu packets buffer.


Tool 2: SPAN the packets punted to the CPU queues to an external packet capture device. Use the commands below to configure the SPAN.

Switch(config)#monitor session 1 source cpu queue all rx
Switch(config)#monitor session 1 destination interface <destination interface number>
Switch(config)#end


Tool 3 :  Identify the Interface That Sends Traffic to the CPU—Cisco IOS Software Release 12.2(20)EW and Later

Reason for high CPU : SA miss due to host flapping between interfaces.