Abstract - In the telecommunication industry, service-oriented architecture is the first but crucial step in answering many challenges, from management issues to meeting product timelines driven by marketing requests. To implement a service-oriented architecture, we need to define at least the technology we will use, the design of the system architecture, the implementation strategy, and the roadmap itself. Last of all is how we manage the established service-oriented system: monitoring service performance, managing the service lifecycle, handling risk management, and so on. To keep services performing at their best, we need good service capacity planning in terms of high availability, service throughput, and resource consumption. On this journey, services will also evolve and expand. At that point, we also need good capacity planning for the platform, including the Enterprise Service Bus, Messaging Bus, and other supporting platforms such as the database.
Introduction
Nowadays, the telecommunication industry faces very tight competition in delivering the best quality of service for short messages, data, and subscriptions to those services. Service-oriented architecture is the first but crucial step in answering this challenge and fulfilling the high demand of subscribers. To implement a service-oriented architecture, there are a few things we need to do. We need to define the technology we will use, determine the design of the system architecture, and plan the implementation strategy. Choosing the best-fit technology is the first critical point. We need a set of criteria for evaluating whether the capabilities of the technology can satisfy our requirements. In the telecommunication domain, five-nines (99.999%) high availability is mandatory. Afterwards, we can design the system architecture to fit our needs.
The last step is managing the well-established SOA-based system. Managing a service-oriented system includes managing service availability, service performance, the service lifecycle, risk management, and so on. This step is important to keep the system stable and to mitigate the risks that may occur in the future. As the number of subscribers keeps growing every day, more transactions are loaded onto the system. This challenges telecommunication providers to keep delivering services at their best performance with high availability. To maintain service performance at its best, we need to keep evaluating service capacity in terms of high availability, throughput, and resource consumption. Moreover, when services evolve and expand, we also need to define the platform capacity itself, including the Enterprise Service Bus, Messaging Bus, and other supporting platforms such as the database.
Capacity planning of an SOA-based system is a mandatory step to keep the system running at its best. It involves two main activities: capacity planning of the services, and capacity planning of the platform. Service capacity planning is mostly about sizing services horizontally so they can handle increasing incoming transaction requests with the allocated system resources. Platform capacity planning is about sizing the platform's capability to provide system resources to all running services, including the Enterprise Service Bus, Messaging Bus, and other supporting platforms such as the database. In this article, we will discuss these two activities.
Measuring can be done through benchmarking and performance tests in an environment that resembles the production one as closely as possible.
For example, let us measure a service for purchasing Blackberry package registration products from the UMB channel; we will refer to this as service-X. The forecasted load is 300 transactions per second (tps), with a service level agreement of not more than 25 seconds. To do the benchmarking, we can apply the load test step by step, from 100 tps up to the point where the service's performance starts to degrade. As a sample, we have the performance test results below:
#   Load (tps)   Response Time (ms)
1   100          1,200
2   150          1,250
3   300          4,000
We can see that the best response time is achieved at 100 tps, which is our baseline. Performance starts to degrade slightly when we put a 200 tps load on the service. Running the service at 150 tps with 2 instances to handle the 300 tps load processes transactions faster than running 300 tps on 1 instance. On row no. 2, one transaction takes 8.33 ms to finish, so 300 transactions need only around 2,500 ms. Compare this with 300 tps on 1 instance, which needs 4,000 ms to complete. Within 4,000 ms, two instances of the service running at 150 tps can together complete around 960 transactions (480 each).
From this analysis, we can conclude that one instance of service-X can handle at most 150 transactions per second, and that running two service instances at 150 tps gives better results than running one instance at 300 tps.
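The arithmetic above can be sketched in a few lines. The figures are taken from the service-X discussion: row no. 2's ~8.33 ms per transaction (a 1,250 ms response time at 150 tps), and a 4,000 ms response time for one instance under a 300 tps load.

```python
# A sketch of the benchmark arithmetic for service-X.

def per_tx_ms(load_tps: float, response_ms: float) -> float:
    """Average time spent on a single transaction at a given load."""
    return response_ms / load_tps

t = per_tx_ms(150, 1250)               # row no. 2 -> ~8.33 ms per transaction
burst_300_ms = 300 * t                 # ~2,500 ms to process 300 transactions
one_instance_300tps_ms = 4000          # measured: one instance under 300 tps

# Transactions a single 150 tps instance finishes within that 4,000 ms window:
done_per_instance = one_instance_300tps_ms / t   # ~480 each, ~960 for two

print(round(t, 2), round(burst_300_ms), round(done_per_instance))
```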
For example, we have a performance test result of service-X for processing-unit usage, as given in table 2 below, using 150 tps with 2 instances as we concluded from the previous example.
#   Load (tps)   Processing Unit Usage per Instance (%)
1   150          2.75
In the production environment, when the service runs and reaches a 150 tps load, it will add an extra 2.75% of platform processing-unit usage per instance. For example, suppose the current condition of our platform uses only 40% of the processing unit (at peak). It is then still safe to run two instances of service-X on the platform.
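A quick headroom check for this scenario: the platform peaks at 40% processing-unit usage, and each service-X instance adds 2.75% at 150 tps. Note that the 80% safety ceiling below is an assumption for illustration, not a figure from this article.

```python
# Headroom check: does adding service-X instances keep us under a ceiling?

CURRENT_PEAK_PCT = 40.0     # platform processing-unit usage at peak
PER_INSTANCE_PCT = 2.75     # extra usage per service-X instance at 150 tps
SAFETY_THRESHOLD = 80.0     # assumed planning ceiling (not from the article)

def projected_usage_pct(instances: int) -> float:
    """Projected platform processing-unit usage with extra instances."""
    return CURRENT_PEAK_PCT + instances * PER_INSTANCE_PCT

print(projected_usage_pct(2))                      # 45.5
print(projected_usage_pct(2) <= SAFETY_THRESHOLD)  # True -> safe to deploy
```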
This baseline data can also be useful when we need to do projection planning. Projection planning is important for management decisions regarding the expansion of the platform, both horizontally and vertically, so that the platform can cope with future events (like Idul Fitri, Christmas Eve, New Year, and so on).
2. Memory - used by the services to store data while transactions run, and released when the transaction is finished. In some cases memory leakage can happen: a service fails to release all of its memory resources back to the platform, usually because of the quality of the service implementation code's object management. So, before we put a service into production, we need to make sure it is free from memory leaks so that it will not disturb our production runtime environment.
Unlike the processing unit, determining how much memory a service needs requires an estimate based on the service's own activities. For example, figure 1 describes the five main activities service-X contains.
In the first activity, the service translates the msisdn and keyword incoming request parameters into an internal data structure. This internal data structure is called the Request Payload, and it consists of two main parts:
1. Header - a payload that defines the properties of every single message request. This part contains several
elements like RequestID, EndSystem, TimestampIn, TimestampOut, Channel, UUID, ESBUUID, and more.
The header part is carried through to the end of the activity chain.
2. Body - the main payload of the request. The body part varies per service, depending on the specific internal data structure implementation. For example, in service-X the body payload consists of the msisdn, keyword, subscriberNo, and soccd elements.
For example, the header part will contain at most 2,048 bytes and the body payload 327 bytes, so we will have a 2,375-byte overhead.
In the next activity, service-X loads the subscriber profile (subscriber number and soccd) from the database based on the msisdn. The subscriber number and soccd are then mapped into the request payload body. For example, the subscriber number takes at most 32 bytes and the soccd at most 16 bytes, so the second activity adds 48 bytes of memory usage.
In the third activity, service-X registers the subscriber number to the corresponding package based on the soccd and the keyword being input. The return value of this activity is only a boolean, which takes 1 byte of data, so this activity adds just 1 byte of memory usage.
Unlike the previous activities, the value of the fourth activity varies with the package the subscriber has registered to, but we can take its maximum value into account. For example, a good response message (as defined by the marketing team) needs 256 bytes; we can use this number as a guideline. In the last activity, the response message is simply sent out through the defined channels (UMB or SMS).
If we sum all of the activities above, a single service-X transaction will need at least 12,204 bytes of memory. As mentioned in the previous example, if service-X runs at 150 tps per service instance, it will need at least 1,830,600 bytes of memory (around 1.74 MB).
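The memory estimate can be reproduced directly, taking the article's 12,204-byte per-transaction footprint as given:

```python
# Per-instance memory estimate for service-X.

BYTES_PER_TX = 12_204   # memory needed by one service-X transaction (given)
LOAD_TPS = 150          # load per service instance

bytes_needed = BYTES_PER_TX * LOAD_TPS      # 1,830,600 bytes
mb_needed = bytes_needed / (1024 * 1024)    # ~1.74 MB

print(bytes_needed)     # 1830600
```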
2. N+1 - means that one ESB becomes the secondary for all primary ESBs. If a primary ESB fails, it should fail over to the secondary ESB. Under this policy, the processing-unit usage of each primary ESB should not exceed 100/N % of capacity, since the secondary ESB has to be able to hold the capacity of all N primary ESBs.
Unlike the processing unit, the memory of the ESB is much more straightforward: we just need enough available paging space for services to allocate memory. For example, say we have 64 GB of memory and, on average, each service on one primary ESB needs 128 MB to run at 150 tps. Our single primary ESB can then serve up to 512 services running at 150 tps. But we should also put a threshold on memory rather than utilizing it to 100%. For example, once memory utilization reaches 60%, we should add another memory unit to the ESB.
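Both ESB sizing rules above can be sketched together: the 100/N % processing-unit cap per primary ESB under the N+1 failover policy, and the memory-based count of services one ESB can host. N = 4 here is an illustrative value, not a figure from the article.

```python
# ESB sizing sketch: CPU cap under N+1 failover, and memory-based capacity.

def primary_cpu_cap_pct(n_primaries: int) -> float:
    """Max processing-unit usage per primary ESB so that the single
    secondary can absorb all N primaries on failover."""
    return 100 / n_primaries

def max_services(esb_memory_gb: int, avg_service_mb: int = 128) -> int:
    """Services (each needing avg_service_mb, e.g. at 150 tps) that fit
    into the ESB's memory."""
    return (esb_memory_gb * 1024) // avg_service_mb

print(primary_cpu_cap_pct(4))   # 25.0 -> each of 4 primaries capped at 25%
print(max_services(64))         # 512 services on a 64 GB ESB
```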
# Hours   Load (tps)   Persistent Size (MB)
2         150          1,879,200
3         150          2,818,800
4         150          3,758,400
5         150          4,698,000
6         150          5,637,600
For example, we have a requirement to keep business logs for 45 days under a 150 tps load, and one log record contains at most 15.5 KB of data. We can estimate the storage capacity we need using the following formulation:
Traffic Level   Peak TPS   Seconds/Hour   Hours   Records
5%  (0.05)      x 150      x 3,600        x 12    = 324,000
10% (0.1)       x 150      x 3,600        x 4     = 216,000
50% (0.5)       x 150      x 3,600        x 4     = 1,080,000
80% (0.8)       x 150      x 3,600        x 2     = 864,000
100% (1)        x 150      x 3,600        x 2     = 1,080,000

DB Overhead: 3
KB per GB: 1,048,576
Data Retention: 45 days
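Putting the formulation together, a minimal sketch: the traffic profile, 15.5 KB record size, 45-day retention, and overhead factor of 3 are the article's figures, while treating the 1,048,576 constant as KB per GB is my reading of the formulation.

```python
# End-to-end storage estimate for 45 days of business logs at 150 tps peak.

TRAFFIC_PROFILE = [          # (fraction of peak load, hours per day)
    (0.05, 12),
    (0.10, 4),
    (0.50, 4),
    (0.80, 2),
    (1.00, 2),
]
PEAK_TPS = 150
RECORD_KB = 15.5             # max size of one log record
RETENTION_DAYS = 45
DB_OVERHEAD = 3              # overhead factor for indexes, metadata, etc.
KB_PER_GB = 1_048_576        # assumed meaning of the 1,048,576 constant

records_per_day = sum(frac * PEAK_TPS * 3600 * hours
                      for frac, hours in TRAFFIC_PROFILE)

storage_gb = (records_per_day * RECORD_KB * RETENTION_DAYS
              * DB_OVERHEAD / KB_PER_GB)

print(int(records_per_day), round(storage_gb))   # 3564000 records/day, ~7112 GB
```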
Conclusion
A database system/storage is not a mandatory component of an SOA-based system, but it is very helpful for seeing what is happening in our production environment. Most of all, it provides data to management about how much revenue the SOA-based system produces. Knowing this, they can decide whether it is worthwhile to put a service into the SOA-based system, considering all the costs and benefits.
http://www.servicetechmag.com/contributors/masykurmarhendra