Feature Pack 8

Troubleshooting: Site content crawler errors

The site content crawler might encounter errors in WebSphere Commerce.

Problem

Errors about missing Bouncy Castle JAR files occur when you run the site content crawler. These errors might occur when the crawler decrypts site content such as PDF files.

An error similar to the following occurs:

00000e44 DataImporter  E org.apache.solr.common.SolrException log Full Import failed:
org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NoClassDefFoundError: org.bouncycastle.jce.provider.BouncyCastleProvider
     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:669)
     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:622)
     at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:268)
     at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187)
     at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
     at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
     at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
Caused by: java.lang.NoClassDefFoundError: org.bouncycastle.jce.provider.BouncyCastleProvider
     at java.lang.J9VMInternals.verifyImpl(Native Method)
     at java.lang.J9VMInternals.verify(J9VMInternals.java:72)
     at java.lang.J9VMInternals.initialize(J9VMInternals.java:134)
     at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1324)
     at org.apache.pdfbox.pdmodel.PDDocument.decrypt(PDDocument.java:796)
     at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:89)
     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
     at com.ibm.commerce.solr.handler.TikaEntityProcessor.load(TikaEntityProcessor.java:276)
     at com.ibm.commerce.solr.handler.TikaEntityProcessor.initConnection(TikaEntityProcessor.java:182)
     at com.ibm.commerce.solr.handler.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:238)
     at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238)
     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:596)
     ... 6 more
Caused by: java.lang.ClassNotFoundException: org.bouncycastle.jce.provider.BouncyCastleProvider
     at java.net.URLClassLoader.findClass(URLClassLoader.java:423)
     at com.ibm.ws.bootstrap.ExtClassLoader.findClass(ExtClassLoader.java:191)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:660)
     at com.ibm.ws.bootstrap.ExtClassLoader.loadClass(ExtClassLoader.java:111)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:626)
     at com.ibm.ws.classloader.ProtectionClassLoader.loadClass(ProtectionClassLoader.java:62)
     at com.ibm.ws.classloader.ProtectionClassLoader.loadClass(ProtectionClassLoader.java:58)
     at com.ibm.ws.classloader.CompoundClassLoader.loadClass(CompoundClassLoader.java:511)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:626)
     at com.ibm.ws.classloader.CompoundClassLoader.loadClass(CompoundClassLoader.java:511)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:626)
     ... 20 more

Solution

Ensure that the crawler is not missing any JAR files that are required to crawl the site content.

For example, download missing JAR files such as bcprov-jdk15.jar and bcmail-jdk15.jar from Bouncy Castle.
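
To confirm whether the provider class named in the stack trace can be found on a given class path, a small check such as the following might help. It is a minimal sketch, not part of WebSphere Commerce or Solr: the class name BouncyCastleCheck and the method isClassAvailable are hypothetical, and the result reflects only the class path that the check itself is run with, which can differ from the class loader that the crawler uses.

```java
// Hypothetical diagnostic helper; not shipped with WebSphere Commerce or Solr.
final class BouncyCastleCheck {

    // Class name copied from the stack trace in the Problem section.
    static final String PROVIDER_CLASS =
            "org.bouncycastle.jce.provider.BouncyCastleProvider";

    /** Returns true if the named class can be located by this class's loader. */
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, BouncyCastleCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError, as seen in the log above.
            return false;
        }
    }

    public static void main(String[] args) {
        if (isClassAvailable(PROVIDER_CLASS)) {
            System.out.println("Found " + PROVIDER_CLASS);
        } else {
            System.out.println("Missing " + PROVIDER_CLASS
                    + "; the Bouncy Castle JAR files are not on this class path");
        }
    }
}
```

Run the check with the same JAR files on the class path (the -cp option of the java command) that you intend to make available to the crawler. If it reports the class as missing, the downloaded JAR files are not being picked up.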