NPE when syncing index to disk #12

Open
johann-petrak opened this issue Feb 24, 2020 · 2 comments

@johann-petrak (Contributor)

Not sure where to put this, as I got it when running Mimir as pulled in from Prospector.
After running the indexing, I wanted to sync to disk; I got a blank screen and the following on the console:

java.lang.reflect.InvocationTargetException: null
        at org.grails.core.DefaultGrailsControllerClass$ReflectionInvoker.invoke(DefaultGrailsControllerClass.java:211)
        at org.grails.core.DefaultGrailsControllerClass.invoke(DefaultGrailsControllerClass.java:188)
        at org.grails.web.mapping.mvc.UrlMappingsInfoHandlerAdapter.handle(UrlMappingsInfoHandlerAdapter.groovy:90)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970)
        at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846)
        at org.springframework.boot.web.filter.ApplicationContextHeaderFilter.doFilterInternal(ApplicationContextHeaderFilter.java:55)
        at org.grails.web.servlet.mvc.GrailsWebRequestFilter.doFilterInternal(GrailsWebRequestFilter.java:77)
        at org.grails.web.filters.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:67)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException: null
        at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:332)
        at gate.mimir.MimirIndex.requestSyncToDisk(MimirIndex.java:645)
        at gate.mimir.web.LocalIndex.sync(LocalIndex.groovy:147)
        at gate.mimir.web.IndexAdminController.sync(IndexAdminController.groovy:75)
        ... 14 common frames omitted

@ianroberts (Member)

You appear to have hit a race condition in AtomicIndex between this code (line 1478, the code that schedules a background sync):

    public Future<Long> requestSyncToDisk() throws InterruptedException {
      if(batchWriteTask == null) {
        batchWriteTask = new FutureTask<Long>(new Callable<Long>() {
          @Override
          public Long call() throws Exception {
            return writeCurrentBatch();
          }
        });
        inputQueue.put(DUMP_BATCH);
      }
      return batchWriteTask;
    }

and this code (line 1608, the thread that processes the background task requests):

    if(aDocument == DUMP_BATCH) {
      // dump batch was requested
      if(batchWriteTask != null) {
        batchWriteTask.run();
      }
      batchWriteTask = null;

The sync is running to completion between the inputQueue.put and the return batchWriteTask: the background thread resets batchWriteTask to null before the caller returns it, so AtomicIndex.requestSyncToDisk returns null. That fails later when MimirIndex.requestSyncToDisk tries to put the null Future into a queue, because LinkedBlockingQueue.put rejects null elements, which is exactly the NullPointerException in the stack trace above.
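For reference, LinkedBlockingQueue.put throws on null by contract, so any code path that hands it a null element fails the same way. A minimal, standalone demonstration of that last step (the class name NullPutDemo is made up for illustration):

    import java.util.concurrent.LinkedBlockingQueue;

    // Demonstrates the failure mode described above: LinkedBlockingQueue
    // rejects null elements, so putting a null value into the sync queue
    // throws exactly the NullPointerException seen in the stack trace.
    public class NullPutDemo {
      public static void main(String[] args) throws InterruptedException {
        LinkedBlockingQueue<Object> queue = new LinkedBlockingQueue<>();
        queue.put(null); // throws java.lang.NullPointerException
      }
    }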

Fixing this will need careful reasoning about the concurrency properties of AtomicIndex, and may need us to introduce some uses of AtomicReference or synchronized (see the sketch below). I'm reluctant to get into this on my own; I'd prefer to wait until I can pair-program it with you or Mark. However, as long as this doesn't reliably happen every time (i.e. you can get a successful sync eventually if you try again), it shouldn't affect the overall integrity of the index. It may mean that the different sub-indexes dump their batches at different times; normally they're all in lock-step, so the switch points between head/tail-0/tail-1 etc. are the same for all token-* and mention-* sub-indexes.
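As a rough illustration only, one possible direction would be to return a local reference to the task rather than re-reading the field after the queue put. This is a sketch under the assumption that the identifiers from the snippets above (batchWriteTask, inputQueue, DUMP_BATCH, writeCurrentBatch) behave as quoted; it is not the project's actual fix:

    // Sketch only: capture the task in a local variable so the worker
    // thread nulling out the batchWriteTask field after the sync completes
    // cannot turn the caller's return value into null.
    public Future<Long> requestSyncToDisk() throws InterruptedException {
      FutureTask<Long> task = batchWriteTask;
      if(task == null) {
        task = new FutureTask<Long>(new Callable<Long>() {
          @Override
          public Long call() throws Exception {
            return writeCurrentBatch();
          }
        });
        batchWriteTask = task;
        inputQueue.put(DUMP_BATCH);
      }
      return task; // never null, even if the sync has already finished
    }

This would remove the null return, but it doesn't address two callers racing to create two tasks, nor the visibility of the field write to the worker thread; that is where AtomicReference.compareAndSet or a synchronized block would come in, and why the careful review described above is still needed.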

@johann-petrak (Contributor, Author)

This seems to have worked all right when I tried another time, so it's not urgent!
