Hello, I am a new user of Asqatasun. The environment was set up successfully, and page audit, file audit, manual audit, and scenario audit all work. Only the full-site audit fails, with the following error message:
Exception in thread "crawl-1512035621805 launchthread" java.lang.IllegalStateException: BeanFactory not initialized or already closed - call 'refresh' before accessing beans via the ApplicationContext
at org.springframework.context.support.AbstractRefreshableApplicationContext.getBeanFactory(AbstractRefreshableApplicationContext.java:172)
at org.springframework.context.support.AbstractApplicationContext.getBean(AbstractApplicationContext.java:1098)
at org.archive.crawler.framework.CrawlJob.getCrawlController(CrawlJob.java:499)
at org.asqatasun.crawler.framework.AsqatasunCrawlJob$1.run(AsqatasunCrawlJob.java:210)
01-12-2017 13:59:50:558 72390646 WARN org.asqatasun.crawler.framework.AsqatasunCrawlJob - Failed to start bean 'fetchProcessors'; nested exception is java.lang.NoSuchMethodError: org.apache.commons.httpclient.HttpState.setCookiesMap(Ljava/util/SortedMap;)V
It seems that the configuration files are wrong, but I don’t know where. Can you give me some suggestions? I have attached the asqatasun-crawler-beans-site settings file to this topic. Thanks for your help.
vim /var/lib/tomcat7/webapps/asqatasun/WEB-INF/classes/log4j.properties
log4j.logger.org.asqatasun.crawler=DEBUG
log4j.logger.org.asqatasun.service=DEBUG
Asqatasun 4.0.3
Yes, I built Asqatasun with Maven.
Proxy: I don’t know what that is, so probably not. How do I set it? Can you give me an example?
I have increased the log level to DEBUG.
asqatasun.log shows the following:
DEBUG org.springframework.web.servlet.DispatcherServlet - Successfully completed request
DEBUG org.springframework.web.servlet.DispatcherServlet - DispatcherServlet with name 'tgol-web-app' processing POST request for [/asqatasun/home/contract/audit-site-set-up.html]
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - Site audit on http://www.hotmail.com
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - param false ALTERNATIVE_CONTRAST_MECHANISM
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - param Rgaa30;LEVEL_2 LEVEL
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - param INFORMATIVE_IMAGE_MARKER
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - param 1920 SCREEN_WIDTH
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - param 86400 MAX_DURATION
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - param true CONSIDER_COOKIES
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - param 1000 MAX_DOCUMENTS
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - param EXCLUSION_REGEXP
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - param PRESENTATION_TABLE_MARKER
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - param 20 DEPTH
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - param DATA_TABLE_MARKER
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - param 1080 SCREEN_HEIGHT
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - param INCLUSION_REGEXP
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - param DECORATIVE_IMAGE_MARKER
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - param COMPLEX_TABLE_MARKER
INFO org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - Launching audit site on http://www.hotmail.com
DEBUG org.springframework.web.servlet.DispatcherServlet - Rendering view [org.springframework.web.servlet.view.JstlView: name 'audit-in-progress'; URL [/WEB-INF/view/audit-in-progress.jsp]] in DispatcherServlet with name 'tgol-web-app'
DEBUG org.asqatasun.service.AuditServiceImpl - auditSite
DEBUG org.springframework.web.servlet.DispatcherServlet - Successfully completed request
DEBUG org.asqatasun.service.AuditServiceThreadQueueImpl - auditCommand polled
DEBUG org.asqatasun.service.AuditServiceThreadQueueImpl - AuditServiceThread created from auditCommand
DEBUG org.asqatasun.service.AuditServiceThreadQueueImpl - AuditServiceThread started
DEBUG org.asqatasun.webapp.orchestrator.AsqatasunOrchestratorImpl - WAIT FOR AUDIT TO COMPLETE:org.asqatasun.entity.audit.AuditImpl@76e70a0,1512355460
INFO org.asqatasun.service.command.SiteAuditCommandImpl - Launching crawler for page http://www.hotmail.com
INFO org.asqatasun.crawler.framework.AsqatasunCrawlJob - outputDir------------/opt/soft/apache-tomcat-7.0.73/webapps/asqatasun
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - outputDir------------/opt/soft/apache-tomcat-7.0.73/webapps/asqatasun
INFO org.asqatasun.crawler.framework.AsqatasunCrawlJob - crawlConfigFilePath------------/opt/soft/apache-tomcat-7.0.73/webapps/asqatasun/WEB-INF/conf/crawler/
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - crawlConfigFilePath------------/opt/soft/apache-tomcat-7.0.73/webapps/asqatasun/WEB-INF/conf/crawler/
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - Directory: /opt/soft/apache-tomcat-7.0.73/webapps/asqatasun/crawl-1512355461279 created
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - crawlConfigFilePath: /opt/soft/apache-tomcat-7.0.73/webapps/asqatasun/WEB-INF/conf/crawler/ for copy
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - filepath : /opt/soft/apache-tomcat-7.0.73/webapps/asqatasun/WEB-INF/conf/crawler//asqatasun-crawler-beans-site.xml
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - CONSIDER_COOKIES true
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - PRESENTATION_TABLE_MARKER
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - EXCLUSION_REGEXP
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - MAX_DOCUMENTS 1000
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - COMPLEX_TABLE_MARKER
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - DATA_TABLE_MARKER
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - SCREEN_WIDTH 1920
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - DECORATIVE_IMAGE_MARKER
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - MAX_DURATION 86400
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - LEVEL Rgaa30;LEVEL_2
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - INCLUSION_REGEXP
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - SCREEN_HEIGHT 1080
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - DEPTH 20
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - ALTERNATIVE_CONTRAST_MECHANISM false
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - INFORMATIVE_IMAGE_MARKER
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - Modifier found for value http://www.hotmail.com/
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - true CONSIDER_COOKIES
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - Modifier found for value true
DEBUG org.asqatasun.crawler.util.HeritrixInverseBooleanAttributeValueModifier - Update ignoreCookies attribute of bean fetchHttp with value false
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - PRESENTATION_TABLE_MARKER
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - EXCLUSION_REGEXP
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - Modifier found for value
DEBUG org.asqatasun.crawler.util.HeritrixParameterValueModifier - [list: null] value
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - 1000 MAX_DOCUMENTS
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - Modifier found for value 1000
DEBUG org.asqatasun.crawler.util.HeritrixAttributeValueModifier - Update maxDocumentsDownload attribute of bean crawlLimiter with value 1000
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - COMPLEX_TABLE_MARKER
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - DATA_TABLE_MARKER
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - 1920 SCREEN_WIDTH
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - DECORATIVE_IMAGE_MARKER
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - 86400 MAX_DURATION
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - Modifier found for value 86400
DEBUG org.asqatasun.crawler.util.HeritrixAttributeValueModifier - Update maxTimeSeconds attribute of bean crawlLimiter with value 86400
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - Rgaa30;LEVEL_2 LEVEL
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - INCLUSION_REGEXP
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - Modifier found for value
DEBUG org.asqatasun.crawler.util.HeritrixParameterValueModifier - [list: null] value
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - 1080 SCREEN_HEIGHT
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - 20 DEPTH
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - Modifier found for value 20
DEBUG org.asqatasun.crawler.util.HeritrixAttributeValueModifier - Update maxHops attribute of bean tooManyHopsDecideRule with value 20
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - false ALTERNATIVE_CONTRAST_MECHANISM
DEBUG org.asqatasun.crawler.util.CrawlConfigurationUtils - INFORMATIVE_IMAGE_MARKER
INFO org.asqatasun.crawler.framework.AsqatasunCrawlJob - configFile-----------/opt/soft/apache-tomcat-7.0.73/webapps/asqatasun/crawl-1512355461279/asqatasun-crawler-beans-site.xml
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - configFile-----------/opt/soft/apache-tomcat-7.0.73/webapps/asqatasun/crawl-1512355461279/asqatasun-crawler-beans-site.xml
INFO org.asqatasun.crawler.framework.AsqatasunCrawlJob - urlList-----------[http://www.hotmail.com]
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - urlList-----------[http://www.hotmail.com]
INFO org.asqatasun.crawler.framework.AsqatasunCrawlJob - heritrixFileName-----------asqatasun-crawler-beans-site.xml
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - heritrixFileName-----------asqatasun-crawler-beans-site.xml
INFO org.asqatasun.crawler.CrawlerImpl - Rel canonical pages are kept for ref Rgaa30
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - crawljob is launchable
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - Job validated
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - Starting context
WARN org.asqatasun.crawler.framework.AsqatasunCrawlJob - Failed to start bean 'fetchProcessors'; nested exception is java.lang.NoSuchMethodError: org.apache.commons.httpclient.HttpState.setCookiesMap(Ljava/util/SortedMap;)V
DEBUG org.asqatasun.crawler.framework.AsqatasunCrawlJob - Context started
There is also another log file on the Tomcat server, catalina.out, which shows the following:
Dec 04, 2017 10:44:21 AM org.archive.crawler.framework.CrawlJob instantiateContainer
INFO: Job instantiated
Dec 04, 2017 10:44:21 AM org.archive.spring.PathSharingContext initLaunchId
INFO: launch id 20171204024421
Exception in thread "crawl-1512355461279 launchthread" java.lang.IllegalStateException: BeanFactory not initialized or already closed - call 'refresh' before accessing beans via the ApplicationContext
at org.springframework.context.support.AbstractRefreshableApplicationContext.getBeanFactory(AbstractRefreshableApplicationContext.java:172)
at org.springframework.context.support.AbstractApplicationContext.getBean(AbstractApplicationContext.java:1098)
at org.archive.crawler.framework.CrawlJob.getCrawlController(CrawlJob.java:499)
at org.asqatasun.crawler.framework.AsqatasunCrawlJob$1.run(AsqatasunCrawlJob.java:210)
That is the content of both logs.
Please forgive the verbosity; I pasted them in full.
It looks like there are multiple versions of the same library on your classpath, and therefore a mismatch between the compile-time and runtime contexts.
Can you please give us the content of the WEB-INF/lib folder so we can see which version of HttpClient you’re using?
I suspect this library is installed on your system in a different version, which would explain the error message.
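One way to check which jar actually supplies the class at runtime is to ask the JVM for the class's code source. The following is only a minimal sketch, not something taken from Asqatasun: the class name WhichJar and both helper methods are made up for illustration, and the answer is only meaningful if the code runs under the same classloader as the webapp (for example from a temporary servlet or JSP), which is an assumption about your setup.

```java
import java.security.CodeSource;
import java.util.Arrays;

public class WhichJar {

    // Returns the jar or directory a class was loaded from, or a note when
    // the JVM does not expose one (the case for JDK bootstrap classes).
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(no code source)";
        }
        return source.getLocation().toString();
    }

    // True if the class itself declares at least one method with this name.
    static boolean declaresMethod(Class<?> type, String methodName) {
        return Arrays.stream(type.getDeclaredMethods())
                .anyMatch(m -> m.getName().equals(methodName));
    }

    public static void main(String[] args) {
        String className = "org.apache.commons.httpclient.HttpState";
        try {
            // Load without initializing the class; only metadata is needed.
            Class<?> type = Class.forName(className, false,
                    Thread.currentThread().getContextClassLoader());
            System.out.println(className + " loaded from " + locationOf(type));
            System.out.println("declares setCookiesMap: "
                    + declaresMethod(type, "setCookiesMap"));
        } catch (ClassNotFoundException e) {
            System.out.println(className + " is not on this classpath");
        }
    }
}
```

If the reported location is not the jar you expected, or setCookiesMap is not declared, that would be consistent with the NoSuchMethodError in the log.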
It still does not work. @koj said the reason may be a version mismatch, so I uploaded a screenshot in my reply. Can you check it too? Thank you very much.
I think I found the cause of your problem.
In the screenshot you posted earlier, we can see that you have dependencies on both httpclient-4.3.6.jar AND commons-httpclient-1.1.jar.
They both embed a class named HttpState in the package org.apache.commons.httpclient. The order in which your classloader picks up these classes may differ from one context to another. It seems that in your case the class from commons-httpclient-1.1.jar is the one loaded, and the expected method does not exist in it.
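To see which jars in WEB-INF/lib contain a given class, the jars can be scanned for the matching entry. This is a rough sketch under stated assumptions: JarClassFinder and jarsContaining are hypothetical names, and the default directory in main is only a placeholder that you would replace with the WEB-INF/lib path of your own deployment.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.jar.JarFile;

public class JarClassFinder {

    // Returns the sorted file names of all *.jar files in libDir that
    // contain the given entry, e.g. "org/apache/commons/httpclient/HttpState.class".
    static List<String> jarsContaining(Path libDir, String entryName) throws IOException {
        List<String> hits = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(libDir, "*.jar")) {
            for (Path jar : jars) {
                try (JarFile jarFile = new JarFile(jar.toFile())) {
                    if (jarFile.getEntry(entryName) != null) {
                        hits.add(jar.getFileName().toString());
                    }
                }
            }
        }
        Collections.sort(hits);
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder path: pass your own WEB-INF/lib directory as the first argument.
        Path libDir = Paths.get(args.length > 0 ? args[0] : "WEB-INF/lib");
        String entry = "org/apache/commons/httpclient/HttpState.class";
        List<String> hits = jarsContaining(libDir, entry);
        System.out.println(entry + " found in: " + hits);
        if (hits.size() > 1) {
            System.out.println("More than one jar provides this class; "
                    + "which one wins depends on classloader order.");
        }
    }
}
```

More than one hit for the same entry would point to the kind of conflict described above.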
Now, I have a question for you.
I’ve just downloaded the latest version from GitHub.
I had a look at the war archive, and commons-httpclient-1.1.jar is not present in the libraries.
I checked my computer and I don’t have commons-httpclient-1.1.jar. I only built the development environment with the command mvn clean install, which downloads the dependencies automatically. I couldn’t find commons-httpclient-1.1.jar; can you share it with me? Thanks very much.
In the message in which you uploaded the list of dependencies (directory WEB-INF/lib), commons-httpclient-3.1.jar (I got the version wrong yesterday) is present and SHOULD NOT be.
We need to find out how and why it is present in the version you built.
If you simply download the latest release and install the war archive, it should work.
I added some new dependencies to /asqatasun-web-app/pom.xml. I pasted a screenshot and also uploaded the pom.xml here. Please check it. Thank you very much.
They are for the accessibility rules of WCAG 2.0 (https://www.w3.org/TR/WCAG20/), which we are adding to Asqatasun. But I don’t think that affects the full-site audit.
Thanks for your support. I have solved the issue. It was caused by the jar in the lib directory. I downloaded the whole project from GitHub, compared it with my project, modified my pom.xml to match the original, and ran the build again; then the audit succeeded. But there is another issue; can you check it? Thanks very much.