Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reset the Thread count (180) and Workload size (10,000) once the Apache Geode concurrency issues are resolved #49

Open
jxblum opened this issue Jan 24, 2020 · 0 comments
Labels
task Development Task

Comments

@jxblum
Copy link
Contributor

jxblum commented Jan 24, 2020

This change must be made to the MultiThreadedHighlyConcurrentClientServerHttpSessionAccessIntegrationTests class.

When the Thread count is 180 and the Workload size is 10,000, the Apache Geode Server becomes non-responsive or unavailable (i.e. it fails)!

[FORK] - 2020-01-24 00:12:20,867  INFO .internal.cache.tier.sockets.BaseCommand: 442 - Error applying delta for key f501b046-5466-4e42-a83e-9fd8c2b6361d of region /Sessions: Cache encountered replay of event containing delta bytes for key f501b046-5466-4e42-a83e-9fd8c2b6361d
[FORK] - 2020-01-24 00:12:20,868  WARN .internal.cache.tier.sockets.BaseCommand: 334 - Server connection from [identity(192.168.99.1(SpringBasedCacheClientApplication:7292:loner):51190:358a9bd6:SpringBasedCacheClientApplication,connection=1; port=51421]: connection disconnect detected by EOF.
2020-01-24 00:12:21,072  WARN ode.cache.client.internal.OpExecutorImpl: 653 - Pool unexpected socket timed out on client connection=Pooled Connection to localhost:51177: Connection[localhost:51177]@1979076415)
2020-01-24 00:12:21,072  WARN ode.cache.client.internal.OpExecutorImpl: 653 - Pool unexpected socket timed out on client connection=Pooled Connection to localhost:51177: Connection[DESTROYED]). Server unreachable: could not connect after 1 attempts
2020-01-24 00:12:21,094  WARN ode.cache.client.internal.OpExecutorImpl: 653 - Pool unexpected socket timed out on client connection=Pooled Connection to localhost:51177: Connection[localhost:51177]@728216184)
2020-01-24 00:12:21,094  WARN ode.cache.client.internal.OpExecutorImpl: 653 - Pool unexpected socket timed out on client connection=Pooled Connection to localhost:51177: Connection[DESTROYED]). Server unreachable: could not connect after 1 attempts
2020-01-24 00:12:21,312  WARN ode.cache.client.internal.OpExecutorImpl: 653 - Pool unexpected socket timed out on client connection=Pooled Connection to localhost:51177: Connection[localhost:51177]@1624117850)
2020-01-24 00:12:21,312  WARN ode.cache.client.internal.OpExecutorImpl: 653 - Pool unexpected socket timed out on client connection=Pooled Connection to localhost:51177: Connection[DESTROYED]). Server unreachable: could not connect after 1 attempts
2020-01-24 00:12:21,329  WARN ode.cache.client.internal.OpExecutorImpl: 653 - Pool unexpected socket timed out on client connection=Pooled Connection to localhost:51177: Connection[localhost:51177]@1563217080)
2020-01-24 00:12:21,329  WARN ode.cache.client.internal.OpExecutorImpl: 653 - Pool unexpected socket timed out on client connection=Pooled Connection to localhost:51177: Connection[DESTROYED]). Server unreachable: could not connect after 1 attempts
2020-01-24 00:12:21,350  WARN ode.cache.client.internal.OpExecutorImpl: 653 - Pool unexpected socket timed out on client connection=Pooled Connection to localhost:51177: Connection[localhost:51177]@578482607)
2020-01-24 00:12:21,350  WARN ode.cache.client.internal.OpExecutorImpl: 653 - Pool unexpected socket timed out on client connection=Pooled Connection to localhost:51177: Connection[DESTROYED]). Server unreachable: could not connect after 1 attempts
2020-01-24 00:12:21,627  WARN ode.cache.client.internal.OpExecutorImpl: 653 - Pool unexpected socket timed out on client connection=Pooled Connection to localhost:51177: Connection[localhost:51177]@880633971)
2020-01-24 00:12:21,627  WARN ode.cache.client.internal.OpExecutorImpl: 653 - Pool unexpected socket timed out on client connection=Pooled Connection to localhost:51177: Connection[DESTROYED]). Server unreachable: could not connect after 1 attempts
[FORK] - 2020-01-24 00:12:21,655  INFO .internal.cache.tier.sockets.BaseCommand: 442 - Error applying delta for key f501b046-5466-4e42-a83e-9fd8c2b6361d of region /Sessions: Cache encountered replay of event containing delta bytes for key f501b046-5466-4e42-a83e-9fd8c2b6361d
[FORK] - 2020-01-24 00:12:21,656  WARN .internal.cache.tier.sockets.BaseCommand: 334 - Server connection from [identity(192.168.99.1(SpringBasedCacheClientApplication:7292:loner):51190:358a9bd6:SpringBasedCacheClientApplication,connection=1; port=51360]: connection disconnect detected by EOF.
[FORK] - 2020-01-24 00:12:21,669  INFO .internal.cache.tier.sockets.BaseCommand: 442 - Error applying delta for key f501b046-5466-4e42-a83e-9fd8c2b6361d of region /Sessions: Cache encountered replay of event containing delta bytes for key f501b046-5466-4e42-a83e-9fd8c2b6361d
[FORK] - 2020-01-24 00:12:21,670  WARN .internal.cache.tier.sockets.BaseCommand: 334 - Server connection from [identity(192.168.99.1(SpringBasedCacheClientApplication:7292:loner):51190:358a9bd6:SpringBasedCacheClientApplication,connection=1; port=51226]: connection disconnect detected by EOF.
[FORK] - 2020-01-24 00:12:21,672  INFO .internal.cache.tier.sockets.BaseCommand: 442 - Error applying delta for key f501b046-5466-4e42-a83e-9fd8c2b6361d of region /Sessions: Cache encountered replay of event containing delta bytes for key f501b046-5466-4e42-a83e-9fd8c2b6361d
[FORK] - 2020-01-24 00:12:21,673  WARN .internal.cache.tier.sockets.BaseCommand: 334 - Server connection from [identity(192.168.99.1(SpringBasedCacheClientApplication:7292:loner):51190:358a9bd6:SpringBasedCacheClientApplication,connection=1; port=51487]: connection disconnect detected by EOF.
[FORK] - 2020-01-24 00:12:21,686  WARN .internal.cache.tier.sockets.BaseCommand: 334 - Server connection from [identity(192.168.99.1(SpringBasedCacheClientApplication:7292:loner):51190:358a9bd6:SpringBasedCacheClientApplication,connection=1; port=51386]: connection disconnect detected by EOF.
[FORK] - 2020-01-24 00:12:21,718  WARN .internal.cache.tier.sockets.BaseCommand: 334 - Server connection from [identity(192.168.99.1(SpringBasedCacheClientApplication:7292:loner):51190:358a9bd6:SpringBasedCacheClientApplication,connection=1; port=51477]: connection disconnect detected by EOF.
[FORK] - 2020-01-24 00:12:21,731  INFO .internal.cache.tier.sockets.BaseCommand: 442 - Error applying delta for key f501b046-5466-4e42-a83e-9fd8c2b6361d of region /Sessions: Cache encountered replay of event containing delta bytes for key f501b046-5466-4e42-a83e-9fd8c2b6361d
[FORK] - 2020-01-24 00:12:21,738  WARN .internal.cache.tier.sockets.BaseCommand: 334 - Server connection from [identity(192.168.99.1(SpringBasedCacheClientApplication:7292:loner):51190:358a9bd6:SpringBasedCacheClientApplication,connection=1; port=51405]: connection disconnect detected by EOF.
[FORK] - 2020-01-24 00:12:21,768  INFO .internal.cache.tier.sockets.BaseCommand: 442 - Error applying delta for key f501b046-5466-4e42-a83e-9fd8c2b6361d of region /Sessions: Cache encountered replay of event containing delta bytes for key f501b046-5466-4e42-a83e-9fd8c2b6361d
[FORK] - 2020-01-24 00:12:21,769  WARN .internal.cache.tier.sockets.BaseCommand: 334 - Server connection from [identity(192.168.99.1(SpringBasedCacheClientApplication:7292:loner):51190:358a9bd6:SpringBasedCacheClientApplication,connection=1; port=51314]: connection disconnect detected by EOF.
[FORK] - 2020-01-24 00:12:21,778  INFO .internal.cache.tier.sockets.BaseCommand: 442 - Error applying delta for key f501b046-5466-4e42-a83e-9fd8c2b6361d of region /Sessions: Cache encountered replay of event containing delta bytes for key f501b046-5466-4e42-a83e-9fd8c2b6361d
[FORK] - 2020-01-24 00:12:21,779  WARN .internal.cache.tier.sockets.BaseCommand: 334 - Server connection from [identity(192.168.99.1(SpringBasedCacheClientApplication:7292:loner):51190:358a9bd6:SpringBasedCacheClientApplication,connection=1; port=51533]: connection disconnect detected by EOF.
[FORK] - 2020-01-24 00:12:21,791  INFO .internal.cache.tier.sockets.BaseCommand: 442 - Error applying delta for key f501b046-5466-4e42-a83e-9fd8c2b6361d of region /Sessions: Cache encountered replay of event containing delta bytes for key f501b046-5466-4e42-a83e-9fd8c2b6361d
[FORK] - 2020-01-24 00:12:21,791  WARN .internal.cache.tier.sockets.BaseCommand: 334 - Server connection from [identity(192.168.99.1(SpringBasedCacheClientApplication:7292:loner):51190:358a9bd6:SpringBasedCacheClientApplication,connection=1; port=51468]: connection disconnect detected by EOF.
[FORK] - 2020-01-24 00:12:21,811  WARN .internal.cache.tier.sockets.BaseCommand: 334 - Server connection from [identity(192.168.99.1(SpringBasedCacheClientApplication:7292:loner):51190:358a9bd6:SpringBasedCacheClientApplication,connection=1; port=51495]: connection disconnect detected by EOF.
[FORK] - 2020-01-24 00:12:21,892  INFO .internal.cache.tier.sockets.BaseCommand: 442 - Error applying delta for key f501b046-5466-4e42-a83e-9fd8c2b6361d of region /Sessions: Cache encountered replay of event containing delta bytes for key f501b046-5466-4e42-a83e-9fd8c2b6361d
[FORK] - 2020-01-24 00:12:21,893  WARN .internal.cache.tier.sockets.BaseCommand: 334 - Server connection from [identity(192.168.99.1(SpringBasedCacheClientApplication:7292:loner):51190:358a9bd6:SpringBasedCacheClientApplication,connection=1; port=51493]: connection disconnect detected by EOF.
[FORK] - 2020-01-24 00:12:22,002  INFO .internal.cache.tier.sockets.BaseCommand: 442 - Error applying delta for key f501b046-5466-4e42-a83e-9fd8c2b6361d of region /Sessions: Cache encountered replay of event containing delta bytes for key f501b046-5466-4e42-a83e-9fd8c2b6361d
[FORK] - 2020-01-24 00:12:22,009  INFO .internal.cache.tier.sockets.BaseCommand: 442 - Error applying delta for key f501b046-5466-4e42-a83e-9fd8c2b6361d of region /Sessions: Cache encountered replay of event containing delta bytes for key f501b046-5466-4e42-a83e-9fd8c2b6361d

[FORK] - 2020-01-24 00:12:22,010  WARN .internal.cache.tier.sockets.BaseCommand: 334 - Server connection from [identity(192.168.99.1(SpringBasedCacheClientApplication:7292:loner):51190:358a9bd6:SpringBasedCacheClientApplication,connection=1; port=51515]: connection disconnect detected by EOF.
java.lang.RuntimeException: Session Access Task Failed
	at org.springframework.session.data.gemfire.MultiThreadedHighlyConcurrentClientServerHttpSessionAccessIntegrationTests.safeFutureGet(MultiThreadedHighlyConcurrentClientServerHttpSessionAccessIntegrationTests.java:296)
	at java.util.stream.ReferencePipeline$4$1.accept(ReferencePipeline.java:210)
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.IntPipeline.reduce(IntPipeline.java:456)
	at java.util.stream.IntPipeline.sum(IntPipeline.java:414)
	at org.springframework.session.data.gemfire.MultiThreadedHighlyConcurrentClientServerHttpSessionAccessIntegrationTests.runSessionWorkload(MultiThreadedHighlyConcurrentClientServerHttpSessionAccessIntegrationTests.java:311)
	at org.springframework.session.data.gemfire.MultiThreadedHighlyConcurrentClientServerHttpSessionAccessIntegrationTests.concurrentSessionAccessIsCorrect(MultiThreadedHighlyConcurrentClientServerHttpSessionAccessIntegrationTests.java:322)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.springframework.test.context.junit4.statements.RunBeforeTestExecutionCallbacks.evaluate(RunBeforeTestExecutionCallbacks.java:74)
	at org.springframework.test.context.junit4.statements.RunAfterTestExecutionCallbacks.evaluate(RunAfterTestExecutionCallbacks.java:84)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.springframework.test.context.junit4.statements.RunBeforeTestMethodCallbacks.evaluate(RunBeforeTestMethodCallbacks.java:75)
	at org.springframework.test.context.junit4.statements.RunAfterTestMethodCallbacks.evaluate(RunAfterTestMethodCallbacks.java:86)
	at org.springframework.test.context.junit4.statements.SpringRepeat.evaluate(SpringRepeat.java:84)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
	at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:251)
	at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:97)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:70)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
	at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:190)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
	at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
	at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
	at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
Caused by: java.util.concurrent.ExecutionException: org.springframework.dao.DataAccessResourceFailureException: nested exception is org.apache.geode.cache.client.NoAvailableServersException
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at org.springframework.session.data.gemfire.MultiThreadedHighlyConcurrentClientServerHttpSessionAccessIntegrationTests.safeFutureGet(MultiThreadedHighlyConcurrentClientServerHttpSessionAccessIntegrationTests.java:293)
	... 43 more
Caused by: org.springframework.dao.DataAccessResourceFailureException: nested exception is org.apache.geode.cache.client.NoAvailableServersException
	at org.springframework.data.gemfire.GemfireCacheUtils.convertGemfireAccessException(GemfireCacheUtils.java:235)
	at org.springframework.data.gemfire.GemfireAccessor.convertGemFireAccessException(GemfireAccessor.java:93)
	at org.springframework.data.gemfire.GemfireTemplate.get(GemfireTemplate.java:172)
	at org.springframework.session.data.gemfire.GemFireOperationsSessionRepository.findById(GemFireOperationsSessionRepository.java:95)
	at org.springframework.session.data.gemfire.AbstractGemFireIntegrationTests.get(AbstractGemFireIntegrationTests.java:405)
	at org.springframework.session.data.gemfire.MultiThreadedHighlyConcurrentClientServerHttpSessionAccessIntegrationTests.lambda$newAddSessionAttributeTask$2(MultiThreadedHighlyConcurrentClientServerHttpSessionAccessIntegrationTests.java:200)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.geode.cache.client.NoAvailableServersException
	at org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.borrowConnection(ConnectionManagerImpl.java:277)
	at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:125)
	at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:108)
	at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:772)
	at org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:91)
	at org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:116)
	at org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2793)
	at org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1470)
	at org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1443)
	at org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:188)
	at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1380)
	at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1319)
	at org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1304)
	at org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:436)
	at org.springframework.data.gemfire.GemfireTemplate.get(GemfireTemplate.java:169)
	... 7 more

I tried different combinations of Thread count and Workload size:

360 Threads / 3000 Ops - PASSES

500 Threads / 3000 Ops - FAILS

500 Threads / 2000 Ops - PASSES

700 Threads / 2000 Ops - FAILS

There seems to be a correlation and threshold with Thread count and Workload size.

This problem only exists (fails) with Apache Geode 1.11.0. This test passes with the original Thread cont (180) and Workload size (10,000) with Apache Geode 1.9.2.

@jxblum jxblum changed the title Reset the Thread count (180) and Workload size (10,000) once the Apache Geode concurrency issues are resolved. Reset the Thread count (180) and Workload size (10,000) once the Apache Geode concurrency issues are resolved Jan 24, 2020
@jxblum jxblum added the task Development Task label Jan 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
task Development Task
Projects
None yet
Development

No branches or pull requests

1 participant