
Adding utility threads for anti-cache eviction #175

Open
apavlo opened this issue Sep 6, 2014 · 15 comments

Comments

@apavlo
Owner

apavlo commented Sep 6, 2014

The following is a rough outline for how to add support for an additional thread to operate "down" in the EE while the main PartitionExecutor thread processes transactions. This is not possible in the current architecture because there is a single shared buffer that we use to pass data + error codes between the Java layer and the C++ layer (through JNI). The mapping between the Java and C++ layers is as follows:

  • Exception Buffer: ExecutionEngineJNI.exceptionBuffer -> `VoltDBEngine::m_exceptionBuffer`
  • VoltTable Result Buffer: ExecutionEngineJNI.deserializer (this is a wrapper for the ByteBuffer) -> `VoltDBEngine::m_reusedResultBuffer`

Note that the memory is allocated in Java and then we pass down the pointers to the C++ layer.
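To make the buffer handoff concrete, here is a minimal sketch of the Java side of that arrangement. The class and method names are illustrative, not the real ExecutionEngineJNI code: the key point is that the buffers must be direct ByteBuffers, because only direct buffers have a stable native address that the C++ layer can obtain through JNI's GetDirectBufferAddress and hold onto.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of how the Java layer allocates the buffers it
// shares with the EE. The real ExecutionEngineJNI differs in detail.
public class BufferSetup {
    // A direct buffer lives outside the JVM heap, so the native side can
    // obtain a stable pointer to it via JNI's GetDirectBufferAddress.
    static ByteBuffer allocateShared(int capacity) {
        return ByteBuffer.allocateDirect(capacity);
    }

    public static void main(String[] args) {
        ByteBuffer exceptionBuffer = allocateShared(1024 * 1024);
        ByteBuffer resultBuffer    = allocateShared(10 * 1024 * 1024);
        // Both must be direct for the C++ layer to keep pointers to them.
        System.out.println(exceptionBuffer.isDirect() && resultBuffer.isDirect());
    }
}
```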

To illustrate why this is a problem with the existing AntiCacheManager implementation, consider the following race condition. The Java AntiCacheManager has its own thread that it uses to unevict data at a partition. If this uneviction process encounters an error, it writes a SerializedException into that partition's shared buffer. If the PartitionExecutor is processing a txn at the same time, it may trip its own exception and want to write into that same buffer. If the other thread is still deserializing the first exception, the contents will get clobbered. This is a race condition that we have definitely seen crop up before.
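The fix the issue proposes is to give the utility thread its own buffer so the two writers can never interleave. A minimal sketch of that idea, with invented names (the real serialization path in H-Store is more involved):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch: each thread serializes its exception into its OWN buffer, so
// the utility thread cannot clobber the PartitionExecutor's buffer.
public class PerThreadBuffers {
    // Write a serialized "exception" into buf and read it back to show
    // the contents survive intact.
    static byte[] writeException(ByteBuffer buf, String msg) {
        byte[] bytes = msg.getBytes(StandardCharsets.UTF_8);
        buf.clear();
        buf.put(bytes);
        byte[] out = new byte[bytes.length];
        buf.flip();
        buf.get(out);
        return out;
    }

    public static void main(String[] args) throws InterruptedException {
        ByteBuffer executorBuf = ByteBuffer.allocate(256); // PartitionExecutor's buffer
        ByteBuffer utilityBuf  = ByteBuffer.allocate(256); // new utility-thread buffer
        Thread utility = new Thread(
            () -> writeException(utilityBuf, "UnknownBlockAccessException"));
        utility.start();
        writeException(executorBuf, "ConstraintFailureException");
        utility.join();
        // Each thread's serialized exception is intact in its own buffer;
        // with a single shared buffer these writes could interleave.
    }
}
```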

  1. What needs to happen is that we make a separate buffer for data and exceptions for utility operations. This will allow us to evict and unevict data in a separate thread without worrying about overwriting the main buffers. Add this new utility buffer to ExecutionEngineJNI and update the parameters of ExecutionEngine.nativeSetBuffers() to pass down this new buffer pointer. You can see how we did the same thing with ExecutionEngineJNI.ariesLogBuffer. You will need to update VoltDBEngine::setBuffers() accordingly.
  2. Modify VoltDBEngine::antiCacheReadBlocks() to use this new utility buffer when there is an exception. You can see the FIXME in the code that makes reference to this problem. You will also need to modify ExecutionEngineJNI.antiCacheReadBlocks() to look for errors in the new utility buffers.
  3. The final step is to add an additional utility thread for evicting data without needing to block transactions. This is more complicated because we need to collect blocks of cold tuples to evict but also make sure that those tuples are not being used by an active txn. We may want to add a new global flag in the EE that tells us that the eviction thread is doing something and keep track of the read-write sets of any txn that executes during this brief window. When we're ready to do the eviction, we then set a lock to prevent txns from executing, remove any tuples from the block we're about to evict that are also in an active txn's ReadWriteSet, and then write out the block. I'm not sure how we want to do this just yet because I don't want to have to check for a lock every time we execute txns. We can talk about this problem when you get to this point.
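The flag-plus-read-write-set scheme in step 3 can be sketched as follows. Everything here is hypothetical (these are not H-Store APIs): the point is that the txn path only pays a single atomic-flag read in the common case, and tuples touched during the preparation window are excluded from the evicted block.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of step 3: while the eviction thread prepares a block, a global
// flag tells executing txns to record the tuples they touch; at eviction
// time those tuples are dropped from the candidate block.
public class EvictionPrepare {
    static final AtomicBoolean evictionInProgress = new AtomicBoolean(false);
    static final Set<Long> activeReadWriteSet = ConcurrentHashMap.newKeySet();

    // Called on the txn execution path. The flag check is one atomic
    // read, so the common (no eviction pending) case stays cheap.
    static void recordAccess(long tupleId) {
        if (evictionInProgress.get()) {
            activeReadWriteSet.add(tupleId);
        }
    }

    // Called by the eviction thread once it is ready to write the block:
    // keep only the tuples that no active txn touched during the window.
    static Set<Long> finalizeBlock(Set<Long> candidateTuples) {
        Set<Long> block = new HashSet<>(candidateTuples);
        block.removeAll(activeReadWriteSet);
        activeReadWriteSet.clear();
        evictionInProgress.set(false);
        return block;
    }

    public static void main(String[] args) {
        evictionInProgress.set(true);     // eviction thread starts preparing
        recordAccess(42L);                // a txn touches tuple 42 in the window
        Set<Long> candidate = new HashSet<>();
        candidate.add(42L);
        candidate.add(7L);
        System.out.println(finalizeBlock(candidate)); // tuple 42 is excluded
    }
}
```

This sketch sidesteps the lock that step 3 worries about; a real version would still need a brief stop-the-txns window while the block is written out.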
@mjgiardino
Contributor

Edit: Never mind, I think I found the problem. I'm throwing the Exception all the way up, but ExecutionEngineJNI isn't checking the right buffer for the exception, causing a failure.

In point 2: "You will also need to modify ExecutionEngineJNI.antiCacheReadBlocks() to look for errors in the new utility buffers."

What kind of errors should I be checking for?

After adding the additional buffer, I'm failing three tests, two of which are related to not catching an UnknownBlockAccessException correctly. I am pretty sure I have initialized and passed the new buffer correctly, though it must not be passing exceptions back up to the Java frontend.

Here is the latest commit.

Thanks.

@mjgiardino
Contributor

Any idea what could be causing this? The logs of commit 52 just have this single error, and I'm not sure how to diagnose the source. I'm also not sure why this test would have an issue with the changes I've made, as they should only affect the AntiCaching.

[junit] Running org.voltdb.regressionsuites.TestPlansGroupBySuite
[junit] 
[junit] Exception: java.lang.NullPointerException thrown from the UncaughtExceptionHandler in thread "ServerThread"
[junit] Running org.voltdb.regressionsuites.TestPlansGroupBySuite
[junit]     org.voltdb.regressionsuites.TestPlansGroupBySuite:testDistributedSumAndGroup-localCluster-2-2-JNI had an error.
[junit] Tests run:   0, Failures:   0, Errors:   1, Time elapsed: 0.00 sec

@zheguang
Contributor

How should I go about testing item 1 and 2? In general, when would AntiCacheEvictionManager::readBlock throw an exception, besides when it's a system error such as UnknownBlockAccess?

@mjgiardino
Contributor

To be honest, the Exception mechanisms are not my strongest coding area, so if I were doing it by myself, I would simply query the AntiCacheEvictionManager for an Abort true/false. Exceptions are probably a better-engineered solution.

AntiCacheEvictionManager::readBlock() could throw a (to be implemented) AbortAndReissueException when the block needed is from an SSD or disk backing store. I don't have a function/method yet to relay that information, but the AntiCacheEvictionManager will know for which AntiCacheDBs it will stall and for which ones transactions will be aborted and reissued. Perhaps a boolean-returning method TransactionAbort? Something like that is what I would implement.
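A minimal sketch of that boolean-returning check, with invented names (this is not existing H-Store code): each anti-cache level reports whether an unevict from it should stall the txn in place (fast backing store) or abort and reissue it (slow backing store).

```java
// Hypothetical sketch: each anti-cache level knows its backing store and
// answers whether a txn needing one of its blocks should be aborted and
// reissued instead of stalled. All names here are invented.
public class AntiCacheLevel {
    enum Backing { NVM, SSD, DISK }

    final Backing backing;

    AntiCacheLevel(Backing backing) {
        this.backing = backing;
    }

    // true  -> abort the txn and reissue it once the block is merged;
    // false -> the fetch is fast enough to stall the txn in place.
    boolean shouldAbortAndReissue() {
        return backing != Backing.NVM;
    }

    public static void main(String[] args) {
        System.out.println(new AntiCacheLevel(Backing.NVM).shouldAbortAndReissue());  // false
        System.out.println(new AntiCacheLevel(Backing.SSD).shouldAbortAndReissue());  // true
    }
}
```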

@zheguang
Contributor

Hi Andy and Michael, I have implemented 1 and 2 and it has passed the tests on Jenkins. The commits are: 0bf9ff6..641c861. I have started looking at 3 and have some thoughts about how to handle conflicting reads and writes while evicting data. I will sync up with Stan first and flesh out a base design for discussion with you guys.

Thanks,
sam

@mjgiardino
Contributor

I think I'm at a good place to sync as well. The migration between layers works, and blocks can be found in any layer. In addition, all the multilevel configuration is tested. I just now added a method in AntiCacheDB to identify whether it is a stalling or aborting layer. We need to discuss how we're going to decide this, as well as what specific policy we'd like to start with for block placement.

@apavlo
Owner Author

apavlo commented Oct 28, 2014

Beyond just passing the existing test cases, did you add test cases for the new features?

@mjgiardino
Contributor

There are new tests in the EE (anticache_eviction_manager_test) that exercise the physical act of migration and LRU block selection, as well as a new junit test (TestAntiCacheMultiLevel) that configures multilevel storage, evicts and merges tuples, and fills a level so that writes are forced to the one below. They are based on your edu.brown.hstore.TestAntiCacheManager test.

@apavlo
Owner Author

apavlo commented Oct 28, 2014

Beautiful. Should we merge this code back into the master?

@mjgiardino
Contributor

Let me rerun those performance tests overnight and I'll submit a pull request tomorrow. I want to skim the code and make sure any hacky debugging printfs are gone.

@apavlo
Owner Author

apavlo commented Oct 28, 2014

Ok. Let's try to schedule a call for this Friday. Can you send an email to the group?

@mjgiardino
Contributor

Will do.

@zheguang
Contributor

Would you guys be available to take a look at my initial patch for the second bullet point? I wrote a long commit message to convey the overall design. This, however, is based on my current understanding of the frontend, so please do point out what looks wrong to you.

zheguang@5537d01

Many thanks!

@mjgiardino
Contributor

It all makes sense to me.

Should we meet tomorrow and sync up?

@apavlo
Owner Author

apavlo commented Jan 29, 2015

Tomorrow is NEDB day, so we're all going to be busy.

On Thursday, January 29, 2015 10:53 AM Michael Giardino wrote:

It all makes sense to me.

Should we meet tomorrow and sync up?

