JSLEE Discussions
  [Search] Search   [Recent Topics] Recent Topics   [Hottest Topics] Hottest Topics   [Groups] Back to home page 
[Register] Register / 
[Login] Login 
Load test on CGIN RA with CS1 protocol
Forum Index » Rhino SLEE Discussions
Author Message
maol


Joined: 02/06/2010 03:19:42
Messages: 34
Offline

Hi, I have questions regarding to load test using CGIN RA v.1.3.2.

We have implemented SBB component which use CGIN RA v.1.3.2 and communicate by INAP CS1 protocol. We have also prepared scenarios to cover functionality of our SBB. Now, during execution 1 caps sometime we can observe warnings:
on Simulator:
08:37:38,638 WARN : onEnd: Unable to find state for dtid=132

and on Rhino the same:
2010-10-27 08:37:38.624 Warning [trace.cginra.stack] <pool-3-thread-2> onEnd: Unable to find state for dtid=132

or only on Simulator:
08:39:06,186 WARN : onContinue: Unable to find state for dtid=2714577



After increase number of calls to 5 caps we can observe many such warnings, but all scenarios are still matched, 'print-status' command printout:
Current status:
Sessions started: 1531
Sessions active: 0
Session matches: 1531
Session non-matches: 0
Non-match reasons: {}
Last non-match details: n/a
Scenarios matched: 1531
Sessions matched by name: {CS1-AnnouncPlayToCPN=15, CS1-AnnouncPlayToCPN_AbandonAfterPA=12, CS1-AnnouncPlayToCPN_AbandonBeforeCTR=11, CS1-AnnouncPlayToCPN_PAError=11, CS1-ContinueAction=11, CS1-ContinueAction_Abandon=12, CS1-NormalReleaseAction=12
CS1-NormalReleaseAction_Abandon=12, CS1-NormalReleaseInternationalCallAction=11, CS1-ReroutingToCallCenter_AbandonAfter1stRRB=11, CS1-ReroutingToCallCenter_AbandonAfter2ndRRB=11, CS1-ReroutingToCallCenter_Answer=11, CS1-ReroutingToCallCenter_Call
dPartyBusy=11}
Dialogs started: 1531
Dialogs ended: 1531
Dialogs active: 0
Dialogs rejected: 0



After that, we increase caps to 30 and we observe new warning type:
on Simulator:
09:21:35,025 WARN : Received TC-END while dialog in invalid state: idle
09:21:35,026 WARN : Unable to send TC-U-ABORT(ABRT_source.dialogue_service_provider) on dialog [dialogID=2717715]


and Exception event:
on Rhino:
2010-10-27 09:50:35.477 Warning [trace.cginra.stack] <pool-3-thread-2> onEnd: Unable to find state for dtid=2755038
2010-10-27 09:51:32.647 Severe [trace.Ericpol_BWlist_CS1_1_0_SNAPSHOT.bwlist-sbb] <jr-15> [Ericpol BWlist CS1/bwlist-sbb] sbbExceptionThrown 'CCInitialDPRequestEvent[dialog=CGIN-Dialog:[dialogID=2757901],invokeId=0,argument={ serviceKey 10, calledPartyNumber { nature NATIONAL, routingToInternalNetworkNumber ALLOWED, numberingPlan ISDN, address "0663555777" }, callingPartyNumber { nature NATIONAL, numberIncomplete false, numberingPlan ISDN, presentation ALLOWED, screening NETWORK_PROVIDED, address "663444666" } }]' on 'CGIN-Dialog:[dialogID=2757901]' java.lang.IllegalMonitorStateException
2010-10-27 09:51:32.647 WARN [rhino.er.stage.eh] <jr-15> sbb invocation threw exception, causing transaction to be marked for rollback
javax.slee.TransactionRolledbackLocalException: Unchecked exception thrown by sbbDeliverEvent
at com.opencloud.ocbb.container.impl.Util.makeRollbackException(26600:14)
at com.opencloud.deployed.Service_Ericpol_BWlist_CS1_pl_ericpol_ngin_bwlist_traffic_jslee_rhino_cs1_1_0_SNAPSHOT_1.SBB_bwlist_sbb_pl_ericpol_ngin_bwlist_traffic_jslee_rhino_cs1_1_0_SNAPSHOTOCBB_Local.sbbDeliverEvent(SBB_bwlist_sbb_pl_ericpol_ngin_bwlist_traffic_jslee_rhino_cs1_1_0_SNAPSHOTOCBB_Local.java:1424)
at com.opencloud.ob.RhinoSDK.T.a(26600:164)
at com.opencloud.ob.RhinoSDK.T.a(26600:136)
at com.opencloud.ob.RhinoSDK.T.e(26600:88)
at com.opencloud.ob.RhinoSDK.rE.run(26600:86)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at com.opencloud.ob.RhinoSDK.cS$b$a$a.run(26600:387)
Caused by: java.lang.IllegalMonitorStateException
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at pl.ericpol.slee.ngin.bwlist.utils.CachedDataWrapper$RWLock.requestRead(CachedDataWrapper.java:453)
at pl.ericpol.slee.ngin.bwlist.utils.CachedDataWrapper.getMultivalueConfigurationParameter(CachedDataWrapper.java:181)
at pl.ericpol.slee.ngin.bwlist.utils.DBCache.getMultivalueConfigurationParameter(DBCache.java:197)
at pl.ericpol.slee.ngin.bwlist.callhandler.CallHandler.getSNFromBNumber(CallHandler.java:1060)
at pl.ericpol.slee.ngin.bwlist.callhandler.CallHandler.analyzeCalledPartyNumber(CallHandler.java:605)
at pl.ericpol.slee.ngin.bwlist.callhandler.CallHandler.handleCall(CallHandler.java:225)
at pl.ericpol.slee.ngin.bwlist.callhandler.CallHandler.checkNumbers(CallHandler.java:87)
at pl.ericpol.ngin.bwlist.traffic.jslee.rhino.cs1.prothandlers.BwlCS1IdleState.getServerOfCallAction(BwlCS1IdleState.java:220)
at pl.ericpol.ngin.bwlist.traffic.jslee.rhino.cs1.prothandlers.BwlCS1IdleState.getAction(BwlCS1IdleState.java:495)
at pl.ericpol.ngin.bwlist.traffic.jslee.rhino.cs1.prothandlers.BwlCS1IdleState.handleIDP(BwlCS1IdleState.java:105)
at pl.ericpol.ngin.bwlist.traffic.jslee.rhino.cs1.BWList.onInitialDP(BWList.java:346)
at com.opencloud.rhino.deployed.sbb.pl_ericpol_ngin_bwlist_traffic_jslee_rhino_cs1.bwlist_sbb_1_0_SNAPSHOT.SbbOCBBBean.sbbDeliverEvent(SbbOCBBBean.java:327)
at com.opencloud.deployed.Service_Ericpol_BWlist_CS1_pl_ericpol_ngin_bwlist_traffic_jslee_rhino_cs1_1_0_SNAPSHOT_1.SBB_bwlist_sbb_pl_ericpol_ngin_bwlist_traffic_jslee_rhino_cs1_1_0_SNAPSHOTOCBB_Local.sbbDeliverEvent(SBB_bwlist_sbb_pl_ericpol_ngin_bwlist_traffic_jslee_rhino_cs1_1_0_SNAPSHOTOCBB_Local.java:1412)
... 7 more


and now, not all scenarion are matched:
print-status
Current status:
Sessions started: 2785773
Sessions active: 26
Session matches: 2785397
Session non-matches: 350
Non-match reasons: {MESSAGE_TIMED_OUT=1, SEND_ERROR=348, UNMATCHED_MESSAGE=1}
Last non-match details: SEND_ERROR: Failed to send protocol message
Non-match at message: EventReportBCSMRequest
Underlying error: com.opencloud.tools.scenario.simulator.protocol.ProtocolAdaptorException: Failed to send protocol message
Scenarios matched: 2785397
Sessions matched by name: {CS1-AnnouncPlayToCPN=214640, CS1-AnnouncPlayToCPN_AbandonAfterPA=214283, CS1-AnnouncPlayToCPN_AbandonBeforeCTR=213924, CS1-AnnouncPlayToCPN_PAError=214286, CS1-ContinueAction=214681, CS1-ContinueAction_Abandon=213724, CS1-NormalReleaseAction=214646, CS1-NormalReleaseAction_Abandon=213764, CS1-NormalReleaseInternationalCallAction=214290, CS1-ReroutingToCallCenter_AbandonAfter1stRRB=214011, CS1-ReroutingToCallCenter_AbandonAfter2ndRRB=214285, CS1-ReroutingToCallCenter_Answer=214576, CS1-ReroutingToCallCenter_CalledPartyBusy=214287}
Dialogs started: 2785773
Dialogs ended: 2785747
Dialogs active: 26
Dialogs rejected: 0


Can you explain if warning and exception are consequence of SBB logic or CGIN RA problem. We use:
Machine:
CPU: Intel(R) Pentium(R) Dual CPU E2140 @ 1.60GHz
RAM: 4GB
HDD: 1x standalone IDE HD
OS: Centos 5.5, 32-bit

Rhino: Rhino SDK 2.1,
CGIN: v.1.3.2.
Java: 1.6.20
Java HeapSize in Rhino config: 2048m
CPU load: ~40% (50 caps, ~30 active sessions)
matt


Joined: 22/12/2008 11:20:39
Messages: 93
Offline

I think it's best to investigate the IllegalMonitorStateException first. Are you able to review what the SBB code is doing to produce this exception?

Caused by: java.lang.IllegalMonitorStateException
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at pl.ericpol.slee.ngin.bwlist.utils.CachedDataWrapper$RWLock.requestRead(CachedDataWrapper.java:453)
at pl.ericpol.slee.ngin.bwlist.utils.CachedDataWrapper.getMultivalueConfigurationParameter(CachedDataWrapper.java:181)
at pl.ericpol.slee.ngin.bwlist.utils.DBCache.getMultivalueConfigurationParameter(DBCache.java:197)


I assume that it's trying to perform a wait() call on an object when it doesn't own the monitor for that object.
maol


Joined: 02/06/2010 03:19:42
Messages: 34
Offline

Hi,
We have investigated problem reported by exception 'java.lang.IllegalMonitorStateException' and now the source code of the SBB is changed and no exception appears more :)
But unfortunatally, warnings still appears on Simulator and Rhino platform. Below we can see in Simulator node two execution of the same CS1 scenario one by one by hand without any load execution on Rhino - first ended without any warning whilst second ended with warning:

> run-session CS1-AnnouncPlayToCPN_AbandonBeforeCTR
Send --> OpenRequest to endpoint2
Send --> InitialDPRequest to endpoint2
Send --> Delimiter to endpoint2
Recv <-- OpenAccept from endpoint2
Recv <-- RequestReportBCSMEventRequest from endpoint2
Recv <-- Delimiter from endpoint2
Send --> EventReportBCSMRequest to endpoint2
Send --> Close to endpoint2
Outcome of "CS1-AnnouncPlayToCPN_AbandonBeforeCTR" session: Matched scenario definition "CS1-AnnouncPlayToCPN_AbandonBeforeCTR"
> run-session CS1-AnnouncPlayToCPN_AbandonBeforeCTR
Send --> OpenRequest to endpoint2
Send --> InitialDPRequest to endpoint2
Send --> Delimiter to endpoint2
Recv <-- OpenAccept from endpoint2
Recv <-- RequestReportBCSMEventRequest from endpoint2
Recv <-- Delimiter from endpoint2
Send --> EventReportBCSMRequest to endpoint2
Send --> Close to endpoint2
11:43:37,471 WARN : onContinue: Unable to find state for dtid=8685
Outcome of "CS1-AnnouncPlayToCPN_AbandonBeforeCTR" session: Matched scenario definition "CS1-AnnouncPlayToCPN_AbandonBeforeCTR"


The problem is that, execution all our scenarios one by one finishing with successfull, but during execution them on load test we can observe that not all them are finished with success - we suppose that Warning generated here cause not matching scenarios during load test, because we can observe a lot of such warnings.
matt


Joined: 22/12/2008 11:20:39
Messages: 93
Offline

Are you using the simulator from the same connectivity pack as the CGIN RA (1.3.2)?

At what call rates are you seeing those warnings?

Are you able to attach the scenario that you're testing?

There is a scalability limit in the scenario simulator which we intent to address in a future release; the warning messages are a common symptom when reaching that limit.

If you're simulating the side which initiates the call flow, one workaround is to use multiple instances of the simulator.
maol


Joined: 02/06/2010 03:19:42
Messages: 34
Offline

> Are you using the simulator from the same connectivity pack as the CGIN RA (1.3.2)?
Yes.

> At what call rates are you seeing those warnings?
Depends on hardware, but we can generalize, that at 10 calls attempts per second we can observe many warnings (1 warn. per 1 seconds), but sometime we can observe such warning during 2 calls attempts per second.

> Are you able to attach the scenario that you're testing?
We use many scenarios during load test, but we are sure scenario attached here 'Abandon event before CTR message scenario.' generate this type of warning.

> There is a scalability limit in the scenario simulator which we intent to address in a future release; the warning messages are a common symptom when reaching that limit.
Could you give us what is the value of 'scalability limit' - How many calls we can perform on second without reaching that limit?

> If you're simulating the side which initiates the call flow, one workaround is to use multiple instances of the simulator.
 Filename CS1-AnnouncPlayToCPN_AbandonBeforeCTR.scen [Disk] Download
 Description
 Filesize 2 Kbytes
 Downloaded:  2 time(s)

matt


Joined: 22/12/2008 11:20:39
Messages: 93
Offline

> > At what call rates are you seeing those warnings?
> Depends on hardware, but we can generalize, that at 10 calls attempts per second we can observe many warnings (1 warn. per 1 seconds), but sometime we can observe such warning during 2 calls attempts per second.

That's not expected at all for a simple scenario like that at such a low call rate.

> > Are you able to attach the scenario that you're testing?
> We use many scenarios during load test, but we are sure scenario attached here 'Abandon event before CTR message scenario.' generate this type of warning.
>
> > There is a scalability limit in the scenario simulator which we intent to address in a future release; the warning messages are a common symptom when reaching that limit.
> Could you give us what is the value of 'scalability limit' - How many calls we can perform on second without reaching that limit?

It depends on your hardware, and how many messages are in the scenario. As a very rough guide (ball park figure): in a moderately complex scenario (e.g. 15-20 messages) on modest hardware, you could expect to get at least 100 cps before reaching its limit. We expect to improve this in future releases.

I don't think this is what's going on though. The scenario in this case is very simple -- I was able to push it in loopback mode (using simulated TCAP) to 200 cps without any such warnings running both sides on the same machine, a modest 3yr old laptop, also running on Java 1.6.

Are you using simulated TCAP, or via Signalware?

Is it possible that those failures coincide with the Rhino side closing the dialogs?

That warning message comes from the simulator TCAP layer, and suggests that a dialog end is received from the connection (northbound) closely after the dialog has been closed by the simulator (southbound). This in itself should be benign, but it suggests that Rhino side (perhaps the service) is closing some of these dialogs.

To investigate further, we'd need debug logging from the simulator (debug logging from the Rhino side would help too).
Debug logging from the simulator can be captured by replacing the "INFO" level settings with "DEBUG" at the bottom of the tools/log4j.properties.
maol


Joined: 02/06/2010 03:19:42
Messages: 34
Offline

Hi.
> Are you using simulated TCAP, or via Signalware?
We use TCAP simulator.

> To investigate further, we'd need debug logging from the simulator (debug logging from the Rhino side would help too).
> Debug logging from the simulator can be captured by replacing the "INFO" level settings with "DEBUG" at the bottom of the tools/log4j.properties.


I turned the log level to DEBUG for the simulator and repeated test which generates '10:51:47,807 WARN : onContinue: Unable to find state for dtid=3'.
This is the CAMEL call flow which I investigated from SBB traces:

I attached two logs 'simulator.log' (from TCAP simulator) and 'rhino.log' (from Rhino). To reproduce the WARN message I had to make 4 attempts before success:

> run-session CS1-AnnouncPlayToCPN_AbandonBeforeCTR
Send --> OpenRequest to endpoint2
Send --> InitialDPRequest to endpoint2
Send --> Delimiter to endpoint2
Recv <-- OpenAccept from endpoint2
Recv <-- RequestReportBCSMEventRequest from endpoint2
Recv <-- Delimiter from endpoint2
Send --> EventReportBCSMRequest to endpoint2
Send --> Close to endpoint2
Outcome of "CS1-AnnouncPlayToCPN_AbandonBeforeCTR" session: Matched scenario definition "CS1-AnnouncPlayToCPN_AbandonBeforeCTR"
> run-session CS1-AnnouncPlayToCPN_AbandonBeforeCTR
Send --> OpenRequest to endpoint2
Send --> InitialDPRequest to endpoint2
Send --> Delimiter to endpoint2
Recv <-- OpenAccept from endpoint2
Recv <-- RequestReportBCSMEventRequest from endpoint2
Recv <-- Delimiter from endpoint2
Send --> EventReportBCSMRequest to endpoint2
Send --> Close to endpoint2
Outcome of "CS1-AnnouncPlayToCPN_AbandonBeforeCTR" session: Matched scenario definition "CS1-AnnouncPlayToCPN_AbandonBeforeCTR"
> run-session CS1-AnnouncPlayToCPN_AbandonBeforeCTR
Send --> OpenRequest to endpoint2
Send --> InitialDPRequest to endpoint2
Send --> Delimiter to endpoint2
Recv <-- OpenAccept from endpoint2
Recv <-- RequestReportBCSMEventRequest from endpoint2
Recv <-- Delimiter from endpoint2
Send --> EventReportBCSMRequest to endpoint2
Send --> Close to endpoint2
Outcome of "CS1-AnnouncPlayToCPN_AbandonBeforeCTR" session: Matched scenario definition "CS1-AnnouncPlayToCPN_AbandonBeforeCTR"
> run-session CS1-AnnouncPlayToCPN_AbandonBeforeCTR
Send --> OpenRequest to endpoint2
Send --> InitialDPRequest to endpoint2
Send --> Delimiter to endpoint2
Recv <-- OpenAccept from endpoint2
Recv <-- RequestReportBCSMEventRequest from endpoint2
Recv <-- Delimiter from endpoint2
Send --> EventReportBCSMRequest to endpoint2
Send --> Close to endpoint2
10:51:47,807 WARN : onContinue: Unable to find state for dtid=3
Outcome of "CS1-AnnouncPlayToCPN_AbandonBeforeCTR" session: Matched scenario definition "CS1-AnnouncPlayToCPN_AbandonBeforeCTR"
matt


Joined: 22/12/2008 11:20:39
Messages: 93
Offline

> I attached two logs 'simulator.log' (from TCAP simulator) and 'rhino.log' (from Rhino).

I don't think these are attached?
matt


Joined: 22/12/2008 11:20:39
Messages: 93
Offline

Is the CTR message in that call flow sent to the SSP on a new dialog, or on the same dialog as the IDP,RRB, and ERB messages?

Such a warning message would make sense if it was sent on the same dialog, since the scenario closes the dialog when sending the ERB (Abandon).
maol


Joined: 02/06/2010 03:19:42
Messages: 34
Offline

> I don't think these are attached?
I am not able to attach log from simulator, because the size od it is larger then 1GB.

> Is the CTR message in that call flow sent to the SSP on a new dialog, or on the same dialog as the IDP,RRB, and ERB messages?
Yes, all INAP messages are sent on the same dialog. In this case, I can understand that WARN message is ok and no problem is here.
 Filename rhino.log [Disk] Download
 Description
 Filesize 68 Kbytes
 Downloaded:  1 time(s)

matt


Joined: 22/12/2008 11:20:39
Messages: 93
Offline

Ok, yes it seems like the problem here is the mismatch between the call flow on the Rhino side and the call flow in the scenario.

We could improve the feedback on the simulator side here to make it more obvious what the underlying problem is.

For future reference regarding large log files, it's fine to attach a zip file or tarball of the log instead.
maol


Joined: 02/06/2010 03:19:42
Messages: 34
Offline

> Ok, yes it seems like the problem here is the mismatch between the call flow on the Rhino side and the call flow in the scenario.
> We could improve the feedback on the simulator side here to make it more obvious what the underlying problem is.


I have investigated once again traces and I realized that there is no exception reported by RA in call flow generating WARN message! In previous scenarios which ended without WARN message there is exception reported by RA - this is real call flow which generate WARN message in simulator:


> For future reference regarding large log files, it's fine to attach a zip file or tarball of the log instead.
Ok, I packed simultor.log to zip file.
 Filename rhino.log [Disk] Download
 Description Rhino log
 Filesize 68 Kbytes
 Downloaded:  0 time(s)

 Filename simulator.zip [Disk] Download
 Description Simulator log
 Filesize 12553 Kbytes
 Downloaded:  0 time(s)

matt


Joined: 22/12/2008 11:20:39
Messages: 93
Offline

This looks like a race condition in the scenario between:
(1) The dialog close reaching Rhino, and
(2) The CTR and PA messages being sent by Rhino to the SSP

In the case of (1), when the close arrives to Rhino first, we can expect the CTR to fail, since the dialog has already been closed. (Hence an Exception on the Rhino side)

In the case of (2), the CTR and PA messages are written to the network before the TCAP stack receives the close (hence no exception). On the SSP side, we now see the CTR and PA messages after having already closed the dialog, hence the WARN messages complaining about an unknown dialog.

Does this differ from what you'd expect?
maol


Joined: 02/06/2010 03:19:42
Messages: 34
Offline

> On the SSP side, we now see the CTR and PA messages after having already closed the dialog, hence the WARN messages complaining about an unknown dialog.
Does this differ from what you'd expect?
This information confirms the belief that a WARN message appeared not due to incorrect implementation of the SBB component, but the race condition what is not a error.
Thanks for analyze and response...

I have question regarding the simulator possibilities. Situation is a shown below:

The problem is that, depending on race condition ('ERB(oAbandon)' or 'response from DB' first) the CS1 message flow is different which cause in Simulator node (SSP) "session: Session was not matched" in first case of flow. From function test's point view I want to test second case (when ERB(Abanond) appers before response from DB), but during Load test Simulator node is not able to send ERB message very fast which cause 'Session was not matched'. From SBB logic's point of view, the first and second case of call flow is correct and therefore the simulator should not generate "session: Session was not matched" warning in any case.
Is there any possibilities to omit ths problem, for example: "Use OR in Scenario definition to expect or not a CS1 message".
matt


Joined: 22/12/2008 11:20:39
Messages: 93
Offline

Is there any possibilities to omit ths problem, for example: "Use OR in Scenario definition to expect or not a CS1 message".


To achieve this, you'd model the two possible message flows as separate scenarios, e.g. CAN-Before-Abandon and Abandon-Before-CAN

In the Abandon-Before-CAN scenario, send the abandon after some delay, e.g.:

ERB-Abandon (DELAY 5000)
------------------->

In the CAN-Before-Abandon scenario:

CAN
<-----------------
CUE
<-----------------
ERB-Abandon
------------------->


Set the Abandon-Before-CAN as the preferred scenario using the set-preferred-scenario command.

This means that at runtime, the simulator will be expecting to send the abandon after the given delay. But if it receives the CAN and CUT messages before that time, it will switch to the CAN-Before-Abandon scenario.
 
Forum Index » Rhino SLEE Discussions
Go to:   
Powered by JForum 2.1.8 © JForum Team