First I will take a quick look (installation and start) at the diagtool, then continue with the web diagtool. In the end you get the same result with both tools: detailed traces from which you should be able to see what is causing SPNego to fail.
Diagtool
The diagtool can be downloaded via Note 957666 - Diagtool for Troubleshooting Security Configuration. Just copy the file to your server and start it with
go.bat conf\spnego.conf c:\usr\sap\\jc00\j2ee\configtool
Now answer the questions (LDAP server, hostname, a Windows username and password, Kerberos username) until you get to the part where you are asked to reproduce the problem.
Reproduce it (access the portal from your client) and click OK. A ZIP file is now created which you can take a look at (we will deal with that later on).
Web Diagtool
The Web diagtool (Note 1045019 - Web diagtool for collecting traces) has the advantage that you do not have to work on the server all the time. Just deploy the file web_diagtool.ear once (using SDM) and run the tool via http://server/diagtool.
Log Analysis
Let's take a look at the log files (I will concentrate on the log files created by the web diagtool, but all this information is also available in the diagtool). First -- as always -- we will take a look at the scenario where everything is working.
Start the WebDiagtool and select security & all and click on Go.
Click Add all.
Click Start.
Now try to log on to the server.
Then click on Stop, and an overview page is displayed.
Click on Show All Traces (of course you can also download a ZIP file version which you could send to SAP support).
A page is displayed with general information about the system. You can also take a look at the login modules if you expand login.modules (if you do that, you should see the same login modules that are shown in the Visual Admin). All ume.properties can be displayed as well (but we will skip that for now).
Below that, all the recorded logs are displayed. It can be a little confusing, but just scroll down until you see something SPNego-related (warning and error messages are displayed in a different color and are easy to find).
Below you can see one example screenshot (keep in mind: my login was successful, but I can still see warnings -- so they are nothing to worry about).
The first important step for us is the step where the J2EE Engine is going through the SPNego template and is looking up the service user we created in the ADS as the very first step in Part 1 of this guide. ("Received no SAPLogon Ticket. Authentication stack: [spnego]" & "Looking for credentials for ...")
Here you can see that the J2EE Engine first checks if a SAP Logon Ticket is received (this is done because we have the EvaluateTicketLoginModule as the first module in our logon stack). This is not the case and the J2EE engine is now looking up the user defined in the krb5.conf file: j2ee-j2e-vmw2053@DEV16.DEV-WDF.SAP.CORP. Everything was fine (actually in this example the credentials were already cached, but you could see the acquisition in the Wireshark trace of the last blog) and the Credentials for realm DEV16.DEV-WDF.SAP.CORP were successfully acquired.
Now the J2EE Engine can continue with the next login module.
At first you will see an Access Denied error message. This is quite alright. As we could see in the HTTP traces of Part 2, the browser tries to access the J2EE Engine not knowing that it has to authenticate itself. So no ticket is sent to the J2EE Engine right away and the SPNego login module cannot succeed. Only after this first call (and the received 401 -- access denied) does the client request a Kerberos ticket for this server from the Ticket Granting Server. So the next access to the J2EE Engine (just scroll down in the logs) looks much better:
This time we still do not have a SAP Logon Ticket, but a token ("YIIE..." -- and we know from Part 2 that this is a Kerberos token) is sent along. The J2EE Engine can interpret this token and extract the servicePrincipalName (now knowing that this ticket is indeed for this J2EE Engine).
Also -- just scroll down a little -- the user credentials YANGE1@DEV16.DEV-WDF.SAP.CORP are found in the ticket.
With this information the UME can try to look up this user. Since we used prefix-based authentication, the username is split into two parts (kpnPrefix,kpnSuffix) = (YANGE1,DEV16.DEV-WDF.SAP.CORP) and the search for this user is unique: Unique search result USER.CORP_LDAP.yange1
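To make the split a little more concrete, here is a minimal sketch of how a principal name can be cut into the two parts shown in the log. This is not the UME's actual code; the function name split_principal is made up for illustration, and I simply assume that the split happens at the "@".

```python
from typing import Tuple


def split_principal(kpn: str) -> Tuple[str, str]:
    """Split 'user@REALM' into (prefix, suffix) at the last '@'.

    Hypothetical helper for illustration only, not the UME implementation.
    """
    prefix, sep, suffix = kpn.rpartition("@")
    if not sep or not prefix or not suffix:
        raise ValueError("not a user@REALM principal: %r" % kpn)
    return prefix, suffix


# The principal from the trace above.
kpn_prefix, kpn_suffix = split_principal("YANGE1@DEV16.DEV-WDF.SAP.CORP")
print(kpn_prefix, kpn_suffix)
```

With prefix-based authentication only the first part is then used to search for the user in the user store.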
That's it. Now the J2EE Engine can create a SAP Logon Ticket and the next time we log on the EvaluateTicketLoginModule will succeed right away!
Now for some examples of what can go wrong.
NTLM token received
This is quite easy to detect. You can see that an NTLM token and not a Kerberos token is sent. Please make sure that all the browser settings are correct and that you access the J2EE Engine with a URL that is also registered as a Service Principal Name (this matters e.g. if you have any DNS aliases -- or have even added the server name to your local hosts file). For details take a look at the previous blog.
The error can also be "Error reading ASN.1 datastructure: java.io.EOFException". That's basically the same, just check the browser settings and the SPN.
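If you only have the raw Authorization header from an HTTP trace, you can tell the two token types apart by their first bytes. The following is a rough sketch, not part of the diagtool: the function name classify_negotiate_token is my own. It relies on two facts: an NTLM message starts with the bytes "NTLMSSP" followed by a zero byte, and a GSS-API token (as used for Kerberos/SPNego) starts with the ASN.1 tag 0x60, which is why the base64 text begins with "YII".

```python
import base64

NTLM_SIGNATURE = b"NTLMSSP\x00"
GSS_INITIAL_TAG = b"\x60"  # ASN.1 [APPLICATION 0], first byte of a GSS-API token


def classify_negotiate_token(header_value):
    """Classify an Authorization header value as none / ntlm / kerberos/spnego / unknown.

    Hypothetical helper: only the first few bytes of the token are inspected.
    """
    if not header_value:
        return "none"  # first request: the server is expected to answer with 401
    scheme, _, token_b64 = header_value.strip().partition(" ")
    token_b64 = token_b64.strip()
    if scheme.lower() != "negotiate" or not token_b64:
        return "unknown"
    # Decode only a short prefix, cut to a multiple of 4 base64 characters.
    head = token_b64[:12]
    head = head[: len(head) - len(head) % 4]
    if not head:
        return "unknown"
    try:
        raw = base64.b64decode(head)
    except ValueError:
        return "unknown"
    if raw.startswith(NTLM_SIGNATURE):
        return "ntlm"
    if raw.startswith(GSS_INITIAL_TAG):
        return "kerberos/spnego"
    return "unknown"


print(classify_negotiate_token("Negotiate YIIE"))
```

A result of "ntlm" points to the browser settings or the SPN, as described above.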
Windows integrated authentication is not enabled
If you see only one call to the J2EE Engine (the access denied we talked about before), please check the Internet Explorer settings again. Make sure that Windows integrated authentication is enabled.
Clock skew too great (37)
When you take a look at the logs, everything seems to be fine. But then -- after the GSS context is created -- you see an error:
GSSException: Failure unspecified at GSS-API level (Mechanism level: Clock skew too great (37)).
This means that the time difference between the client and the server is too great (Kerberos tolerates a default time difference, which is usually about 5 minutes). Please check the time of both client and server. You can also try to issue a
net time /set /domain
on the client (and on the server). It will synchronize the time of the machine with that of the domain.
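The check behind this error is simple arithmetic. Here is a small sketch, assuming the usual tolerance of 5 minutes (your domain may be configured differently); the function name and the timestamps are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the common Kerberos default tolerance of 5 minutes.
MAX_SKEW = timedelta(minutes=5)


def skew_too_great(client_time, server_time, max_skew=MAX_SKEW):
    """Return True if the two clocks differ by more than the allowed skew."""
    return abs(client_time - server_time) > max_skew


# Hypothetical timestamps, both in UTC.
server = datetime(2008, 1, 15, 12, 0, 0, tzinfo=timezone.utc)
client_ok = server + timedelta(minutes=3)
client_bad = server - timedelta(minutes=7)

print(skew_too_great(client_ok, server))
print(skew_too_great(client_bad, server))
```

Note that the direction does not matter: a client that is 7 minutes behind fails just like one that is 7 minutes ahead.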
Integrity check on decrypted field failed (31)
If you receive this error in the log, please make sure that the password of the service user has not changed. If you are unsure, delete the folder kerberos (see Part 2) and set the username / password again. Also make sure that you selected Use DES encryption types for this account, and run the Wizard all over again. Before trying again, clean your ticket cache (e.g. by logging off and on again). It could be that the cached ticket is wrong and you have to request a new one from the TGS (actually this took me quite some time to figure out when I was writing this blog :-).
Acquiring credentials for realm failed
Take a close look at the realm name. Is it really spelled correctly? What about uppercase / lowercase? This can also be a problem. In the logs you can see that the realm the J2EE Engine is looking for is in lowercase. But a closer look at the attributes of this user (remember: ldifde...) shows that it is in uppercase. Follow Note 1130190 - SPNego fails with "Failed to find any Kerberos Key" to correct this.
In general: keep in mind that the data you enter is case-sensitive. Take a look at the attributes of the users created and enter the data accordingly. A similar error can also appear if the Service Principal Name you entered (remember: setspn -A ...) is missing or wrong.
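A quick way to spot this kind of mistake is to compare the realm you configured with the one stored in the directory, once exactly and once ignoring case. A minimal sketch (the function name compare_realms is hypothetical):

```python
def compare_realms(configured, directory):
    """Compare the configured realm with the realm found in the directory.

    Returns 'match', 'case mismatch' or 'different'.
    """
    if configured == directory:
        return "match"
    if configured.lower() == directory.lower():
        return "case mismatch"
    return "different"


# The situation described above: lowercase in the configuration,
# uppercase in the attributes of the user.
print(compare_realms("dev16.dev-wdf.sap.corp", "DEV16.DEV-WDF.SAP.CORP"))
```

A "case mismatch" is exactly the situation that Note 1130190 deals with.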
Wrong JDK
Unfortunately I do not have a screenshot for this just yet, but if you encounter an error like:
Error in some of the login modules.
[EXCEPTION]
com.sap.engine.services.security.exceptions.BaseLoginException: Error in some of the login modules.
...Caused by: java.lang.NullPointerException
please make sure that you are not using Java 1.4.2_14, 1.4.2_15 or 1.4.2_16. All these versions contain a bug (Note 1057474 - NullPointerException in KRB5LoginMoule)
Thanks to the comments from Marc Lehmann and Hermann Hans on my first blog, it looks like we have a workaround (at least for _15 and _16). Please try to add isInitiator = false to the com.sun.security.auth.module.Krb5LoginModule in the com.sun.security.jgss.accept component via the Visual Administrator. I haven't tried this out yet, but it seems to work!
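If you have to check several systems, the version test can be written down in a few lines. This is only a sketch: the list contains exactly the three versions named above, the function name is my own, and I assume the version string looks like "1.4.2_15", possibly followed by a build suffix such as "-b02".

```python
# The three versions named above as affected by Note 1057474.
AFFECTED_VERSIONS = {"1.4.2_14", "1.4.2_15", "1.4.2_16"}


def is_affected_jdk(java_version):
    """Return True if the given Java version string is one of the affected ones.

    Assumption: an optional build suffix is separated by '-' and can be ignored.
    """
    base = java_version.strip().split("-", 1)[0]
    return base in AFFECTED_VERSIONS


print(is_affected_jdk("1.4.2_15"))
print(is_affected_jdk("1.4.2_13"))
```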
KDC has no support for encryption type
This error should no longer appear if you used the Wizard. Before the Wizard existed, you had to create the keytab file with a ktpass command, and there were different versions of it available. If this problem occurs: please use the Wizard. If you do not / cannot do this, take a look at Note 942111 - KDC has no support for encryption type.
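The error messages quoted in the sections above can be collected into a small lookup table, so that you can run a trace file through it and get a first hint. This is just a sketch of the idea, not a feature of the diagtool: the names KNOWN_ERRORS and triage are made up, and the table only contains message texts that are quoted in this blog.

```python
# (text to search for in the trace, first thing to check)
KNOWN_ERRORS = [
    ("Error reading ASN.1 datastructure",
     "check the browser settings and the SPN"),
    ("Clock skew too great (37)",
     "check the time on client and server"),
    ("Integrity check on decrypted field failed (31)",
     "check the service user password, then clean the ticket cache"),
    ("Failed to find any Kerberos Key",
     "check spelling and case of the realm (Note 1130190)"),
    ("KDC has no support for encryption type",
     "use the Wizard or see Note 942111"),
]


def triage(log_text):
    """Return the hints for all known error texts found in log_text."""
    return [hint for needle, hint in KNOWN_ERRORS if needle in log_text]


sample = ("GSSException: Failure unspecified at GSS-API level "
          "(Mechanism level: Clock skew too great (37)).")
print(triage(sample))
```

An empty result only means that none of the listed texts was found, not that the trace is free of problems.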
One last thing
While writing this blog I also ran into something which was very irritating. I had set up my VMware client once more, and SPNego simply would not work. I went through all my checks -- but it still would not work. I did traces on the J2EE Engine and with Wireshark, and everything seemed to be working -- only the Kerberos ticket would not get created. Then I decided to move the client from the domain to a workgroup, restarted the client and joined it to the domain again -- and all of a sudden it was working.
That's it. As a last section I want to provide a list of notes which deal with SPNego:
Some Notes
Note 968191 - SPNego: Central Note
Note 994791 - SPNego Wizard
Note 1082560 - SAP AS Java can not start after running SPNego wizard
Note 958107 - Using Diagtool for Troubleshooting Kerberos
Note 957666 - Diagtool for Troubleshooting Security Configuration
Note 1045019 - Web diagtool for collecting traces
Note 934138 - IE browser sends NTLM token instead of Kerberos
Note 1130190 - SPNego fails with "Failed to find any Kerberos Key"
Note 1057474 - NullPointerException in KRB5LoginMoule
Note 1079609 - SPNego token cannot be decrypted
Note 956833 - Password logon and Kerberos authentication
Note 982044 - SPNego succeeds but overall logon fails
Note 1073458 - GSS exception during SPNego authentication
Note 986060 - Kerberos service user has userPassword LDAP attribute
Note 935644 - Configuring Kerberos on NW04 against Database User Store
Note 1005209 - Double Logon Screen
OK, I hope that I have covered the most important problems you can encounter when configuring SPNego. Please feel free to point me to other issues you have seen and I will try to add them here (maybe I will create a Wiki page where we can all consolidate SPNego related issues). Also, if you have other SPNego related questions, please feel free to comment and I will think about writing Part 4 ;-)