www.jaxenter.com
#46
Microservices
Are they for
everyone?
Editorial

Index
Microservices: Storm in a teacup, or teacups in a storm? (Holly Cummins)
CD and CI (Lyndsay Prewer)
Architecture (Eric Horesnyi) 7
Java (Angelika Langer) 10
Java (Geertjan Wielenga) 13
JEP 222 (Werner Keil) 14
Databases (Aviran Mordo) 16
Business intelligence must evolve (Chris Neumann) 17
Database solutions (Zigmars Raascevskis) 22
Tests (Colin Vipurs) 24
Financial services PaaS and private clouds (Patricia Hines) 27
Web (Kris Beevers) 29
Trade-offs in benchmarking: Cost, scope and focus (Aysylu Greenberg) 31
Common threats to your VoIP system (Sheldon Smith) 32
REST (Ben Busse) 34
Milliseconds matter (Per Buer) 38
Hot or Not
Developer grief
Microservices
Microservices: Storm in a teacup, or teacups in a storm?
Somehow, the buzz surrounding microservices has us believing that every single employee and enterprise must break up their monolith empires and follow the microservices trend. But it's not everyone's cup of tea, says JAX London speaker Holly Cummins.
by Holly Cummins
Folks, we have reached a new phase on the Microservices Hype Cycle. Discussion of the microservices hype has overtaken discussion of the actual microservices technology. We're all talking about microservices, and we're all talking about how we're all talking about microservices. This article is, of course, contributing to that cycle. Shall we call it the checkpoint of chatter?
Let's step back. I think we're all now agreed on some basic principles. Distributing congealed tea across lots of teacups doesn't make it any more drinkable; microservices are not a substitute for getting your codebase in order. Microservices aren't the right fit for everyone. On the other hand, microservices do encourage many good engineering practices, such as clean interfaces, loose coupling, and high cohesion. They also encourage practices that are a bit newer, but seem pretty sensible, such as scalability through statelessness and development quality through accountability (you write it, you make it work in the field).
Many of these architectural practices are just good software engineering. You'll get a benefit from adopting them, but if you haven't already adopted them, will you be able to do that along with a shift to microservices? A big part of the microservices debate now centres on the best way to transition to microservices. Should it be a big bang, a gradual peeling of services off the edge, or are microservices something which should be reserved for greenfield projects?
CD and CI
by Lyndsay Prewer
To paraphrase Wikipedia, Continuous Delivery is a software engineering approach that produces valuable software in short cycles and enables production releases to be made at any time. Continuous Delivery is gaining recognition as a best practice, but adopting it and iteratively improving it is challenging. Given the diversity of teams and architectures that do Continuous Delivery well, it's clear that there is no single, golden path.
This article explores how two very different teams successfully practiced and improved Continuous Delivery. Both teams
were sizeable and mature in their use of agile and lean practices. One team chose microservices, Scala, MongoDB and
Docker on a greenfield project. The other faced the constraints
of a monolithic architecture, legacy code, .NET, MySQL and
Windows.
Both teams also worked in small increments, which, in turn, allowed the risk present in a software increment to be more easily identified.
Low-cost deployment (and rollback): Once a release candidate has been produced by the CI system, and the team is happy with its level of risk, one or more deployments will take place, to a variety of environments (normally QA, Staging/Pre-Production, Production). When practicing Continuous Delivery, it's typical for these deployments to happen multiple times per week, if not per day. A key success factor is thus to minimise the time and effort of these deployments. The microservice team were able to reduce this overhead to minutes, which enabled multiple deployments per day. The monolith team reduced it to hours, in order to achieve weekly deployments.
Regardless of how frequently production deployments happen, the cost and impact of rolling back must be tiny (seconds), to minimise service downtime. This makes rolling back pain-free and not a bad thing to do.
Monitoring and alerting: No matter how much testing (manual or automated) a release candidate has had, there is always a risk that something will break when it goes into Production. Both teams were able to monitor the impact of a release in near real-time using tools such as Elasticsearch, Kibana, Papertrail, Splunk and New Relic. Having such tools easily available is great, but they're next to useless unless people look at them and they are coupled to automated alerting (such as PagerDuty). This required a culture of caring about Production, so that the whole team (not just Operations, QA or Development) knew what normal looked like, and noticed when Production's vital signs took a turn for the worse.
Conclusion
This article has highlighted how two teams, with very different architectures, both successfully practiced Continuous Delivery. It's touched on some of the shared patterns that have enabled this. If you'd like to hear more about their Continuous Delivery journey, including the different blockers and accelerators they faced, and the ever-present impact of Conway's Law, I'll be speaking on this topic at JAX London on 13-14th October 2015.
Lyndsay Prewer is an Agile Delivery Lead, currently consulting for Equal Experts. He focuses on helping people, teams and products become even more awesome, through the application of agile, lean and systemic practices. A former rocket scientist and software engineer, over the last two decades he's helped ten companies in two hemispheres improve their delivery.
Architecture
by Eric Horesnyi
As developers or architects, we often have to communicate fine concepts of network or system architecture to decision-makers. In my case, I have been using the smart-city analogy for twenty years. And to celebrate the 25th birthday of the Web, I propose to draw an in-depth analogy between a designed city, Paris, and the Web. Going through Fielding's thesis, we will compare Paris to the Web in terms of constraints, inherited features and architectural style choices, and finally assess whether these choices meet the objectives. All this with a focus on a transformational period of Paris, 1853-1869, under Haussmann as Architect, with an approach worth applying to the many large corporate information systems looking to adopt a microservice style, as proposed by Fowler and Newman.
Here are the first two episodes out of the seven that we will cover during the session at the upcoming JAX London. Our audience is, by design, either software experts interested in city architecture to illustrate the beauty of HTTP, REST and Continuous Delivery, or anybody wanting to understand the success of the Web and get acquainted with key web lingo and concepts.
Outside was even worse: ridden with cybercrime (you get it) in obscure streets, slow with narrow access lines (streets) and non-existent backbones (boulevards), without a shared protocol for polling traffic (sidewalks for pedestrians) or streaming (subway), and no garbage collection (gotcha). Worse, when a user would go from one page to another, TLP was terrible because of these congested, un-protocoled lines, but they would also come home with viruses for lack of continuous delivery of patches to the servers (no air circulation or sun down these narrow streets, hidden by overly high buildings).
To top it off, service was fully interrupted and redesigned regularly (Revolutions in 1789, 1815, the Trois Glorieuses in 1830, 1848), without backward compatibility. Although users would benefit from these changes in the long run, they definitely did not appreciate the long period of adaptation to the new UI, not to mention calls and escalations to a non-existent service desk (votes for the poor and women). Well, actually, these small access lines made it easy to mount DDoS attacks (barricades), a feature the business people (Napoleon III) did not like.

Eugène Haussmann

Congested servers, crowded apartments in Paris

Pre-Haussmannian street in Paris: unsafe, ridden with viruses, no cooling
First page of the Code Civil, 1804
Eric Horesnyi was a founding team member at Internet Way (French B2B
ISP, sold to UUNET) then Radianz (Global Finance Cloud, sold to BT). He
is a High Frequency Trading infrastructure expert, passionate about Fintech and Cleantech. Eric looks after 3 bozons and has worked in San
Francisco, NYC, Mexico and now Paris.
Java
by Angelika Langer
Java 8 came with a major addition to the JDK collection framework, namely the Stream API. Similar to collections, streams represent sequences of elements. Collections support operations such as add(), remove(), and contains() that work on a single element. Streams, in contrast, have bulk operations such as forEach(), filter(), map(), and reduce() that access all elements in a sequence. The notion of a Java stream is inspired by functional programming languages, where the corresponding abstraction is typically called a sequence, which also has filter-map-reduce operations. Due to this similarity, Java 8, at least to some extent, permits a functional programming style in addition to the object-oriented paradigm that it supported all along.
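To make the contrast concrete, here is a minimal sketch (the words and numbers are illustrative only):

import java.util.Arrays;
import java.util.List;

public class StreamIntro {
  public static void main(String[] args) {
    List<String> words = Arrays.asList("tea", "cup", "storm", "teacup");

    // Collection operations work on a single element ...
    boolean hasTea = words.contains("tea");

    // ... whereas stream operations process the whole sequence in bulk:
    int totalLength = words.stream()
        .filter(w -> w.length() > 3)  // keep words longer than three characters
        .map(String::length)          // map each word to its length
        .reduce(0, Integer::sum);     // reduce the lengths to a sum

    System.out.println(hasTea + ", " + totalLength);  // prints: true, 11
  }
}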
Perhaps contrary to widespread belief, the designers of the Java programming language did not extend Java and its JDK to allow functional programming in Java, or to turn Java into a hybrid object-oriented and functional programming language. The actual motivation for inventing streams for Java was performance, or more precisely, making parallelism more accessible to software developers (see Brian Goetz, State of the Lambda). This goal makes a lot of sense to me, considering the way in which hardware evolves. Our hardware has dozens of CPU cores today and will probably have hundreds some time in the future. In order to effectively utilize the hardware capabilities and thereby achieve state-of-the-art execution performance, we must parallelize. After all, what is the point of running a single thread on a multicore platform? At the same time, multithreaded programming is considered hard and error-prone, and rightly so. Streams, which come in two flavours (sequential and parallel), are designed to hide the complexity of running multiple threads. Parallel streams make it extremely easy to execute bulk operations in parallel: magically, effortlessly, and in a way that is accessible to every Java developer.
We measured on outdated hardware (dual core, no dynamic overclocking), with proper warm-up and everything else it takes to produce halfway reliable benchmark figures. This was the result in that particular context:
int-array, for-loop : 0.36 ms
int-array, seq. stream: 5.35 ms
Again, the for-loop is faster than the sequential stream operation, but the difference on an ArrayList is not nearly as significant as it was on an array. Let's think about it. Why do the results differ that much? There are several aspects to consider.
First, access to array elements is very fast. It is an index-based memory access with no overhead whatsoever. In other words, it is plain down-to-the-metal memory access. Elements in a collection such as ArrayList, on the other hand, are accessed via an iterator, and the iterator inevitably adds overhead. Plus, there is the overhead of boxing and unboxing collection elements, whereas int-arrays use plain primitive ints. Essentially, the measurements for the ArrayList are dominated by the iteration and boxing overhead, whereas the figures for the int-array illustrate the advantage of for-loops.
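For readers who want to reproduce the shape of such a measurement, here is a minimal sketch. The workload (finding the largest element) and the ad-hoc timing are assumptions on my part for illustration; serious measurements need a proper harness with warm-up, as described above:

import java.util.Arrays;
import java.util.Random;

public class LoopVsStream {
  public static void main(String[] args) {
    int[] ints = new Random(42).ints(500_000, 0, 1_000_000).toArray();

    long t0 = System.nanoTime();
    int max = Integer.MIN_VALUE;
    for (int i : ints) {                 // plain loop over the array
      if (i > max) max = i;
    }
    long t1 = System.nanoTime();
    int streamMax = Arrays.stream(ints).max().getAsInt();  // sequential stream
    long t2 = System.nanoTime();

    System.out.printf("for-loop: %d in %.2f ms, seq. stream: %d in %.2f ms%n",
        max, (t1 - t0) / 1e6, streamMax, (t2 - t1) / 1e6);
  }
}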
Secondly, had we seriously expected that streams would be faster than plain for-loops? Compilers have 40+ years of experience optimizing loops, and the virtual machine's JIT compiler is especially apt to optimize for-loops over arrays with an equal stride, like the one in our benchmark. Streams, on the other hand, are a very recent addition to Java, and the JIT compiler does not (yet) perform any particularly sophisticated optimizations for them.
Thirdly, we must keep in mind that we are not doing much with the sequence elements once we get hold of them. We spend a lot of effort trying to get access to an element and then we don't do much with it. We just compare two integers, which after JIT compilation is barely more than one assembly instruction.
The reality check via our benchmark yields a ratio (sequential/parallel) of only 1.6 instead of 2.0, which illustrates the
amount of overhead that is involved in going parallel and
how (well or poorly) it is overcompensated (on this particular
platform).
You might be tempted to generalise these figures and conclude that parallel streams are always faster than sequential streams, perhaps not twice as fast (on dual-core hardware) as one might hope for, but at least faster. However, this is not true. Again, there are numerous aspects that contribute to the performance of a parallel stream operation.
One of them is the splittability of the stream source. An
array splits nicely; it just takes an index calculation to figure
out the mid element and split the array into halves. There is
no overhead and thus barely any cost of splitting. How easily
do collections split compared to an array? What does it take
to split a binary tree or a linked list? In certain situations you
will observe vastly different performance results for different
types of collections.
Another aspect is statefulness. Some stream operations maintain state. An example is the distinct() operation. It is an intermediate operation that eliminates duplicates from the input sequence, i.e., it returns an output sequence with distinct elements. In order to decide whether the next element is a duplicate or not, the operation must compare it to all elements it has already encountered. For this purpose it maintains some sort of data structure as its state. If you call distinct() on a parallel stream, its state will be accessed concurrently by multiple worker threads, which requires some form of coordination or synchronisation, which adds overhead, which slows down parallel execution, up to the extent that parallel execution may be significantly slower than sequential execution.
With this in mind it is fair to say that the performance
model of streams is not a trivial one. Expecting that parallel
stream operations are always faster than sequential stream
operations is naive. The performance gain, if any, depends on
numerous factors, some of which I briefly mentioned above.
If you are familiar with the inner workings of streams you
will be capable of coming up with an informed guess regarding the performance of a parallel stream operation. Yet, you
need to benchmark a lot in order to find out for a given context whether going parallel is worth doing or not. There are
indeed situations in which parallel execution is slower than
sequential execution and blindly using parallel streams in all
cases can be downright counter-productive.
The realisation is: yes, parallel stream operations are easy to use and often they run faster than sequential operations, but don't expect miracles. Also, don't guess; instead, benchmark a lot.
Angelika Langer works as a trainer and consultant with a course curriculum of Java and C++ seminars. She enjoys speaking at conferences, among them JavaOne, JAX, JFokus, JavaZone and many more. She is author of the online Java Generics FAQs and a Lambda Tutorial & Reference at www.AngelikaLanger.com.
by Geertjan Wielenga
We can no longer make assumptions about where and how the applications we develop will be used. Where originally HTML, CSS and JavaScript were primarily focused on presenting documents in a nice and friendly way, the utility of the browser has exploded beyond what could ever have been imagined. And, no, it's not all about multimedia, i.e., video and audio and the like. It's all about full-blown applications that can now be programmed for the browser. Why the browser? Because the browser is everywhere: on your mobile device, on your tablet, on your laptop, and on your desktop computer.
Seen from the perspective of the Java ecosystem, this development is a bit of a blow. All along, we thought the JVM would be victorious, i.e., we thought the write once, run anywhere mantra would be exclusively something that we as Java developers could claim as our terrain. To various extents, of course, that's still true, especially if you see Android as Java for mobile. Then you could make the argument that on all devices, some semblance of Java is present. The arguments you'd need to make would be slightly complicated by the fact that most of your users don't actually have Java installed, i.e., they physically need to do so, or your application needs to somehow install Java on your users' device. Whether you're a Java enthusiast or not, you need to admit that the reach of the browser is far broader and more intuitively present than that of Java, at this point.
So, how do we deal with this reality? How can you make
sure that your next application supports all these different
devices, which each have their own specificities and eccentricities? On the simplest level, each device has its own screen
size. On a more complex level, not every device needs to enable interaction with your application in the same way. Some
Java
JEP 222
Among the few truly new features coming in Java 9 (alongside Project Jigsaw's modularity) is a Java Shell that has recently been confirmed. Java Executive Committee member Werner Keil explains how Java's new REPL got started and what it's good for.
by Werner Keil
As proposed in OpenJDK JEP 222 [1], JShell offers a REPL (Read-Eval-Print Loop) to evaluate declarations, statements and expressions of the Java language, together with an API allowing other applications to leverage its functionality. The idea is not exactly new. BeanShell [2] has existed for over 15 years now, nearly as long as Java itself, not to mention that JVM languages such as Scala and Groovy have long featured similar shells.
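For a flavour of the interaction, here is a sketch of a short JShell session (based on the Java 9 early-access builds; the exact prompts and feedback messages may change before release):

jshell> int x = 21
|  Added variable x of type int with initial value 21

jshell> int twice(int n) { return n * 2; }
|  Added method twice(int)

jshell> twice(x)
|  Expression value is: 42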
BeanShell (just like Groovy, by the way) made an attempt at standardisation through the Java Community Process [3] in JSR 274, a JSR that did not produce any notable output, in spite of the fact that (or perhaps because?) two major companies, Sun and Google, had joined the expert group. Under the JCP.next initiative this JSR was declared Dormant.
An eyebrow-raising approach
Adding a new Java feature like this via JEP, rather than waking up the Dormant JSR (which anyone could, including
Figure 1: JShell arithmetic
Frink provides many more mathematical and physical formulas, including unit conversion. Based on JSR 363, the upcoming Java Units of Measurement standard [11], this will be possible in a similar way. With Groovy, co-founder Guillaume Laforge documented a DSL/REPL for Units of Measurement using JSR 275 a while back [12]. That solution was used in real-life medical research for malaria treatments. Of course, being written in Java, someone might also simply expose the actual Frink language and system via JShell under Java 9!
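To give an idea of what unit-safe code looks like, here is a sketch against the JSR 363 API and its reference implementation as they stood at the time of writing (package and class names were still in flux, so treat them as assumptions):

import javax.measure.Quantity;
import javax.measure.quantity.Length;
import tec.units.ri.quantity.Quantities;
import static tec.units.ri.unit.MetricPrefix.KILO;
import static tec.units.ri.unit.Units.METRE;

public class UnitsDemo {
  public static void main(String[] args) {
    // A quantity carries its unit along, so conversions stay explicit and type-safe.
    Quantity<Length> marathon = Quantities.getQuantity(42.195, KILO(METRE));
    System.out.println(marathon.to(METRE));  // 42195 m (formatting depends on the implementation)
  }
}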
Werner Keil is an Agile Coach, Java EE and IoT/Embedded/Real Time
expert. Helping Global 500 enterprises across industries and leading IT
vendors, he has worked for over 25 years as Program Manager, Coach,
SW architect and consultant for Finance, Mobile, Media, Transport and
Public sectors. Werner is an Eclipse and Apache Committer and JCP member in JSRs like 333 (JCR), 342 (Java EE 7), 354 (Money), 358/364 (JCP.next), Java
ME 8, 362 (Portlet 3), 363 (Units, also Spec Lead), 365 (CDI 2), 375 (Java EE Security) and in the Executive Committee.
References
[1] http://openjdk.java.net/jeps/222
[2] http://www.beanshell.org/
[3] http://jcp.org
[4] http://openjdk.java.net/projects/dio/
[5] https://en.wikipedia.org/wiki/Windows_PowerShell
[6] http://teiid.jboss.org/tools/adminshell/
[7] http://blog.takipi.com/java-9-early-access-a-hands-on-session-with-jshell-thejava-repl/
[8] https://en.wikipedia.org/wiki/Q%26A_(Symantec)
[9] http://www.javamoney.org
[10] https://futureboy.us/frinkdocs/
[11] http://unitsofmeasurement.github.io/
[12] https://dzone.com/articles/domain-specific-language-unit-
Databases
by Aviran Mordo
NoSQL is a set of database technologies built to handle massive amounts of data or specific data structures foreign to relational databases. However, the choice to use a NoSQL database is often based on hype, or on the wrong assumption that relational databases cannot perform as well as a NoSQL database. Operational cost is often overlooked by engineers when it comes to selecting a database.

When building a scalable system, we found that an important factor is using proven technology, so that we know how to recover fast if there's a failure. Pre-existing knowledge and experience with the system and its workings, as well as being able to Google for answers, is critical for swift mitigation. Relational databases have been around for over 40 years, and there is vast industry knowledge of how to use and maintain them. This is one reason we usually default to using a MySQL database instead of a NoSQL database, unless NoSQL is a significantly better solution to the problem.
However, using MySQL in a large-scale system may present performance challenges. To get great performance from MySQL, we employ a few usage patterns. One of these is avoiding database-level transactions. Transactions require that the database maintain locks, which has an adverse effect on performance. Instead, we use logical application-level transactions, thus reducing the load and extracting high performance from the database. For example, let's think about an invoicing schema. If there's an invoice with multiple line items, instead of writing all the line items in a single transaction, we simply write line by line without any transaction. Once all the lines are written to the database, we write a header record, which has pointers to the line items' IDs. This way, if something fails while writing the individual lines to the database and the header record is not written, then the whole transaction fails. A possible downside is that there may be orphan rows in the database. We don't see this as a significant issue though, as storage is cheap and these rows can be purged later if more space is needed.
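A minimal sketch of that write pattern in JDBC terms (the table and column names are invented for illustration):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class InvoiceWriter {
  // Write line items first, each as a plain auto-committed insert.
  // The invoice only "exists" once the header row is written last.
  public void writeInvoice(Connection con, long invoiceId, List<String> items)
      throws SQLException {
    try (PreparedStatement line = con.prepareStatement(
        "INSERT INTO invoice_lines (invoice_id, line_no, description) VALUES (?, ?, ?)")) {
      for (int i = 0; i < items.size(); i++) {
        line.setLong(1, invoiceId);
        line.setInt(2, i);
        line.setString(3, items.get(i));
        line.executeUpdate();          // no surrounding database transaction
      }
    }
    try (PreparedStatement header = con.prepareStatement(
        "INSERT INTO invoices (id, line_count) VALUES (?, ?)")) {
      header.setLong(1, invoiceId);
      header.setInt(2, items.size());
      header.executeUpdate();          // the header makes the lines visible
    }
    // If a line insert fails, no header is written; readers never see the
    // invoice, and any orphan line rows can be purged later.
  }
}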
Data
Business intelligence must evolve
Every employee and every end user should have the right to find answers using data analytics. But the current reliance on IT for key information is creating an unnecessary bottleneck, says DataHero's Chris Neumann.
by Chris Neumann
Self-service is a term that gets used a lot in the business intelligence (BI) space these days. In reality, data analytics has largely ignored the group of users that really needs self-service, even as that user base has grown. More than ever, people realize the value of data, but non-technical users are still left out of the conversation. While everything from storage to collaboration tools has become simple enough for anyone to download and begin using, BI and data analytics tools still require end users to be experts or to seek the help of experts. That needs to change.
Users should be able to get up and running on data analytics and connect to the services they use most, easily. More employees in every department are expected to make decisions based on their data, but that doesn't mean everyone needs to be a data analyst or data scientist. Business users want to analyse data that lives in the services they use every day, like Google Analytics, HubSpot, Marketo, Shopify and even Excel, and they know the questions they need answered. What they need are truly self-service tools to get those answers.
The consumerization of BI
BI and data analytics have largely missed the consumerization-of-IT trend, despite industry-wide use of the term self-service. That doesn't mean that change isn't coming. The shift to the cloud is continuing to accelerate, and the emerging self-service cloud BI space is quickly heating up, driven by user demand and a need to decouple analytics from IT.
Chris Neumann is the founder and Chief Product Officer of DataHero,
where he aims to help everyone unmask the clues in their data. Previously he was the first employee at Aster Data Systems and describes
himself as a data-analytics junkie, a bona fide techie and a self-proclaimed
foodie.
JAX London
www.jaxlondon.com
Keynotes
VC from the inside: a techie's perspective
After many years in CTO roles with SpringSource, VMware, and Pivotal, and having experienced what it is like to work in a VC-backed company, in June of 2014 Adrian switched sides and joined the venture capital firm Accel Partners in London. So what exactly does a technologist do inside a venture capital firm? And having been part of the process from the inside, how do investment decisions get made? In this talk Adrian will share some of the lessons he's learned since embedding in the world of venture capital, and how you can maximise your chances of investment and a successful company-building partnership.
From Design Thinking to DevOps and Back Again: Unifying Design and Operations
In this new talk, I will share some stories of changes the teams I work with have made and explain some mechanisms that we applied to make those changes. Teams I work with at Unruly use eXtreme Programming (XP) techniques to build our systems. Modern XP has many counterintuitive practices, such as mob and pair programming. How did new ways of seeing old problems help us resolve them?
Come along to this talk to hear about some practical techniques you can use to help solve tricky problems and get others on board with your idea by shifting perspective.
Timetable
Monday October 12th, 09:00-17:00
Sessions include: Keynote: From Design Thinking to DevOps and Back Again: Unifying Design and Operations (Jeff Sussna); Benchmarking: You're Doing It Wrong (Aysylu Greenberg); Open Source workflows with BPMN 2.0, Java and Camunda BPM (Niall Deehan); All Change! How the new Economics of Cloud will make you think differently about Java (Steve Poole, Chris Bailey); Le Mort du Product Management (Nigel Runnels-Moss); Architectural Resiliency (Jeremy Deane); Lambdas Puzzler (Peter Lawrey).
Speakers include: James Lewis, Jeff Sussna, Sandro Mancuso, Peter Lawrey, Aysylu Greenberg, Angelika Langer, Niall Deehan, Vinita Rathi, Lyndsay Prewer, Lukas Eder, Aviran Mordo, Kasia Mrowca, Jessica Rose, Jonathan Gallimore, Tim Berglund, Jeremy Deane, Paul Stack, Per Buer, Samir Talwar, Chris Richardson, Steve Poole, Chris Bailey, Nigel Runnels-Moss, Adrian Colyer, Rachel Davies, Geertjan Wielenga, Benjamin Stopford, Eric Horesnyi, Colin Vipurs, John Davies.

jaxlondon.com
Cloud
Database solutions
by Zigmars Raascevskis
Cloud computing engines today allow businesses to easily extend their IT infrastructure at any time. This means that you can rent servers with only a few clicks, and various software stacks including web servers, middleware and databases can be installed and run on these server instances with little to no effort. With data continuing to accumulate at a rapid pace, the database is becoming a large part of this infrastructure. By leveraging conventional cloud computing, every business can run its own database stack in the cloud the same way as if it were on-premises.

There's still a huge amount of potential to accelerate speed and efficiency by using a multi-tenant database.
Distributed databases can serve as a solid foundation for distributed computing that is massively parallel and instantly scalable.
I Failure tolerance of distributed systems
By design, distributed systems with state replication are resistant to most forms of single-machine failure. Guarding against single-machine hardware failures is relatively straightforward. With the distributed database design, every database is hosted on multiple machines that replicate each partition several times. Therefore, in the case of server failure, each system routes traffic to healthy replicas, making sure that data is replicated elsewhere and ensuring higher availability. However, making distributed systems tolerant of software failures is much more difficult, due to common cause, and presents a difficult challenge. The ultimate power of distributed systems comes from parallelism, but this also means that the same code is executed on every server participating in fulfilling the request. If working on a particular request causes a fatal failure that has a negative impact on the operation of a system, or even crashes it, the entire cluster is immediately affected.
Sophisticated methods are necessary to avoid such correlated failures, which might be rare but have devastating effects. One method involves trying each query on a few isolated computational nodes before sending it down to the entire cluster with massive parallelism. Once failures are observed in the sandbox, suspicious requests are immediately quarantined and isolated from the rest of the system.
Tests
by Colin Vipurs
Over my many years of software development I've had to perform various levels of testing against many different database instances and types, including RDBMS and NoSQL, and one thing remains constant: it's hard. There are a few approaches that can be taken when testing the database layer of your code, and I'd like to go over a few of them, pointing out the strengths and weaknesses of each.
Mocking
It may sound silly, but the best way to verify that your database interaction code works as expected is to actually have it run against a real database. Where that isn't practical, one option is to mock the driver layer itself, as Listing 1 does with jMock against the JDBC API.
Listing 1
@Test
public void testJdbc() throws Exception {
  final Connection connection = context.mock(Connection.class);
  final ResultSet resultSet = context.mock(ResultSet.class);
  final PreparedStatement preparedStatement = context.mock(PreparedStatement.class);
  final States query = context.states("query").startsAs("pre-prepare");
  context.checking(new Expectations() {{
    oneOf(connection).prepareStatement("SELECT firstname, lastname, occupation FROM users");
      will(returnValue(preparedStatement)); then(query.is("prepared"));
    oneOf(preparedStatement).executeQuery();
      when(query.is("prepared")); will(returnValue(resultSet)); then(query.is("executed"));
    oneOf(resultSet).next(); when(query.is("executed")); will(returnValue(true)); then(query.is("available"));
    oneOf(resultSet).getString(1); when(query.is("available")); will(returnValue("Hermes"));
    oneOf(resultSet).getString(2); when(query.is("available")); will(returnValue("Conrad"));
    oneOf(resultSet).getString(3); when(query.is("available")); will(returnValue("Bureaucrat"));
    oneOf(resultSet).close(); when(query.is("available"));
    oneOf(preparedStatement).close(); when(query.is("available"));
  }});
  // exercise the data-access code under test with the mocked Connection here
}
Listing 2
import javax.sql.DataSource;
import org.junit.Before;
import org.junit.Test;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseBuilder;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseType;

public class EmbeddedDatabaseTest {
  private DataSource dataSource;

  @Before
  public void createDatabase() {
    dataSource = new EmbeddedDatabaseBuilder().
      setType(EmbeddedDatabaseType.H2).
      addScript("schema.sql").
      addScript("test-data.sql").
      build();
  }

  @Test
  public void aTestRequiringADataSource() {
    // execute code using DataSource
  }
}
When you're using a different DataSource to your production instance, it can be easy to miss configuration options required to make the Driver operate correctly. Recently I came across such a setup, where H2 was configured to use a DATETIME column requiring millisecond precision. The same schema definition was used on a production MySQL instance, which not only required this to be DATETIME(3) but also needed the useFractionalSeconds=true parameter provided to the driver. This issue was only spotted after the tests were migrated from H2 to a real MySQL instance.
Real databases: Where possible, I would highly recommend testing against a database that's as close as possible to the one being run in your production environment. A variety of factors can make this difficult or even impossible, such as commercial databases requiring a license fee, meaning that installing on each and every developer machine is prohibitively costly. A classic way to get around this problem is to have a single development database available for everyone to connect to. This in itself can cause a different set of problems, not least of which are performance (these always seem to get installed on the cheapest and oldest hardware) and test repeatability. The issue with sharing a database with other developers is that two or more people running the tests at the same time can lead to inconsistent results and data shifting in unexpected ways. As the number of people using the database grows, this problem gets worse; throw the CI server into the mix and you can waste a lot of time re-running tests and trying to find out if anyone else is running tests right now in order to get a clean build.
If you're running a free database such as MySQL, or one of the many free NoSQL options, installing on your local development machine can still be problematic: issues such as needing to run multiple versions concurrently, or keeping everyone informed of exactly what infrastructure needs to be up and what ports they need to be bound to. This model also requires the software to be up and running prior to performing a build, making onboarding staff onto a new project more time-consuming than it needs to be.

Thankfully, over the last few years several tools have appeared to ease this, the most notable being Vagrant and Docker. As an example, spinning up a local version of MySQL in Docker can be as easy as issuing the following command:
$ docker run -p 3306:3306 -e MYSQL_ROOT_PASSWORD=bob mysql
Listing 3
<dataset>
<USER FIRST_NAME="John"
SURNAME="Smith"
DOB="19750629"/>
<USER FIRST_NAME="Jane"
SURNAME="Doe"
DOB="19780222"/>
</dataset>
Listing 4
def "existing user can be read"() {
given:
sql.execute('INSERT INTO users (id, name) VALUES (1234, "John Smith")')
when:
def actualUser = users.findById(1234)
then:
actualUser.id == 1234
actualUser.name == 'John Smith'
}
Listing 5
def "new user can be stored"() {
given:
def newUser = new User(1234, "John Smith")
when:
users.save(newUser)
then:
def actualUser = users.findById(1234)
actualUser.id == 1234
actualUser.name == 'John Smith'
}
Colin Vipurs started professional software development in 1998 and released his first production bug shortly after. He has spent his career working in a variety of industries using a wide range of technologies, always attempting to release bug-free code. He holds an MSc from Liverpool University and currently works at Shazam as a Developer/Evangelist. He has spoken at numerous conferences worldwide.
Finance IT
Financial services PaaS and private clouds: Managing and monitoring disparate environments
by Patricia Hines
Financial Institutions (FIs) find that deploying PaaS and IaaS
solutions within a private cloud environment is an attractive
alternative to technology silos created by disparate server
hardware, operating systems, applications and application
programming interfaces (APIs). Private cloud deployments
enable firms to take a software-defined approach to scaling
and provisioning hardware and computing resources.
While other industries have long enjoyed the increased
agility, improved business responsiveness and dramatic cost
savings by shifting workloads to public clouds, many firms
in highly regulated industries like financial services, healthcare and government are reluctant to adopt public cloud. As
a result of increased regulatory and compliance scrutiny for
these firms, the potential risks of moving workloads to public
clouds outweigh any potential savings.
Figure 1: MuleSoft
Even if a firm decides to eventually re-architect legacy applications for private PaaS hosting, or to move workloads across multiple PaaS solutions, it is critical that organizations develop an overarching connectivity strategy that seamlessly ties together systems, data and workflow and accommodates a long-term migration journey. In order to achieve a single pane of glass for managing and monitoring, organizations need the ability to connect and integrate the various environments and enable service discovery, naming, routing, and rollback for SOAP web services, REST APIs, microservices and data sources.
Web
by Kris Beevers
Internet-based applications are built markedly differently today than they were even just a few years ago. Application architectures are largely molded by the capabilities of the infrastructure and core services upon which the applications are built. In recent years we've seen tectonic shifts in the ways infrastructure is consumed, code is deployed and data is managed. A decade ago, most online properties lived on physical infrastructure in co-location environments, with dedicated connectivity and big-iron database back ends, managed by
Benchmarks
Trade-offs in benchmarking
Is it quality you're looking to improve? Or performance? Before you decide on what kind of benchmark your system needs, you need to know the spectrum of cost and benefit trade-offs.
by Aysylu Greenberg
Benchmarking software is an important step in maturing a system. It is best to benchmark a system after correctness, usability, and reliability concerns have been addressed. In the typical
lifetime of a system, emphasis is first placed on correctness of
implementation, which is verified by unit, functional, and integration tests. Later, the emphasis is placed on the reliability
and usability of the system, which is confirmed by the monitoring and alerting setup of a system running in production for
an extended period of time. At this point, the system is fully
functional, produces correct results, and has the necessary set
of features to be useful to the end client. At this stage, benchmarking the software helps us to gain a better understanding
of what improvement work is necessary to help the system gain
a competitive edge.
There are two types of benchmarks one can create: performance and quality. Performance benchmarks generally measure latency and throughput. In other words, they answer the questions: How fast can the system answer a query? How many queries per second can it handle? And how many concurrent queries can the system handle? Quality benchmarks, on the other hand, address domain-specific concerns, and do not translate well from one system to another. For instance, on a news website, a quality benchmark could be the total number of clicks, comments, and shares on each article. In contrast, a different website may include not only those properties but also what the users clicked on. This might happen because the website's revenue depends on the number of referrals, rather than on how engaging a particular article was.
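To make the performance side concrete, here is a deliberately simplified sketch that measures latency and throughput for a stand-in workload (everything here is illustrative; real benchmarks should use a proper harness such as JMH, with warm-up and statistical rigour):

public class MiniBench {
  public static void main(String[] args) {
    Runnable op = () -> Math.log(System.nanoTime());  // stand-in workload

    // Warm up so the JIT compiles the hot path before we measure.
    for (int i = 0; i < 100_000; i++) op.run();

    int iterations = 1_000_000;
    long start = System.nanoTime();
    for (int i = 0; i < iterations; i++) op.run();
    long elapsed = System.nanoTime() - start;

    double avgLatencyUs = elapsed / 1e3 / iterations;   // microseconds per operation
    double throughput = iterations / (elapsed / 1e9);   // operations per second
    System.out.printf("latency: %.3f us/op, throughput: %.0f ops/s%n",
        avgLatencyUs, throughput);
  }
}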
Benchmarking: You're Doing It Wrong
Hear Aysylu Greenberg speak at the JAX London: Knowledge of
how to set up good benchmarks is invaluable in understanding
performance of the system. Writing correct and useful benchmarks
is hard, and verification of the results is difficult and prone to
errors. When done right, benchmarks guide teams to improve the
performance of their systems. In this talk, we will discuss what you
need to know to write better benchmarks.
Security
Common threats to your VoIP system
VoIP remains a popular system for telephone communication in the enterprise. But have you ever considered the security holes this system is leaving you open to? And what company secrets are at risk from eavesdropping, denial-of-service and vishing attacks?
by Sheldon Smith
I Transmission issues
II Denial of service
The next security risk inherent to VoIP? Attacks intended to slow down or shut down your voice network for a period of time. As noted by a SANS Institute whitepaper, malicious attacks on VoIP systems can happen in a number of ways. First, your network may be targeted by a denial-of-service (DoS) flood, which overwhelms the system. Hackers may also choose buffer-overflow attacks, or infect the system with worms and viruses in an attempt to cause damage or prevent your VoIP service from being accessed. As noted by a recent CBR article, VoIP attacks are rapidly becoming a popular avenue for malicious actors: UK-based Nettitude said that within minutes of bringing a new VoIP server online, attack volumes increased dramatically.
Dealing with these threats means undertaking a security audit of your network before adding VoIP. Look for insecure endpoints, third-party applications and physical devices that may serve as jumping-off points for attackers to find their way into your system. This is also a good time to assess legacy apps and older hardware to determine if they're able to handle the security requirements of internet-based telephony. It's also worth taking a hard look at any network protection protocols and firewalls to determine if changes must be made. Best bet? Find an experienced VoIP provider who can help you assess existing security protocols.
III Eavesdropping
Another issue for VoIP systems is eavesdropping. If your traffic is sent unencrypted, for example, it's possible for motivated attackers to listen in on any call made. The same goes for former employees who haven't been properly removed from the VoIP system or had their login privileges revoked. Eavesdropping allows malicious actors to steal classified information, including phone numbers, account PINs and users' personal data. Impersonation is also possible: hackers can leverage your VoIP system to make calls and pose as a member of your company. Worst-case scenario? Customers and partners are tricked into handing over confidential information.

Handling this security threat means developing policies and procedures that speak to the nature of the problem. IT departments must regularly review who has access to the VoIP system and how far this access extends. In addition, it's critical to log and review all incoming and outgoing calls.
IV Vishing
According to the Government of Canada's Get Cyber Safe website, another emerging VoIP threat is voice phishing, or vishing. This occurs when malicious actors redirect legitimate calls to or from your VoIP network and instead connect them to online predators. From the perspective of an employee or customer the call seems legitimate, and they may be convinced to provide credit card or other information. Spam over Internet Telephony (SPIT) is also a growing problem; here, hackers use your network to send thousands of voice messages to unsuspecting phone numbers, damaging your reputation and consuming your VoIP transmission capacity. To manage this issue, consider installing a separate, dedicated internet connection for your VoIP alone, allowing you to easily monitor its traffic apart from other internet sources.
V Call fraud
The last VoIP risk comes from call fraud, also called toll fraud. This occurs when hackers leverage your network to make high-volume and lengthy calls to long-distance or premium numbers, resulting in massive costs to your company. In cases of toll fraud, calls are placed to revenue-generating numbers, such as international toll numbers, which generate income for attackers and leave you with the bill.

Call monitoring forms part of the solution here, but it's also critical to develop a plan that sees your VoIP network regularly patched with the latest security updates. Either create a recurring patch schedule or find a VoIP provider that automatically updates your network when new security updates become available.
VoIP systems remain popular thanks to their ease of use, agility and global reach. They're not immune to security issues, but awareness of common threats, coupled with proactive IT efforts, helps you stay safely connected.
REST
by Ben Busse
Where I work at DreamFactory, we designed and built some of the very first applications that used web services on Salesforce.com, AWS and Azure. Over the course of ten years, we learned many painful lessons trying to create the perfect RESTful backend for our portfolio of enterprise applications.

When a company decides to start a new application project, the business team first defines the business requirements, and then a development team builds the actual software. Usually there is a client-side team that designs the application and a server-side team that builds the backend infrastructure. These two teams must work together to develop a REST API that connects the backend data sources to the client application. One of the most laborious aspects of the development process is the interface negotiation that occurs between these two teams (Figure 1). Project scope and functional requirements often change throughout the project, affecting API and integration requirements. The required collaboration is complex and encumbers the project.
new app with various developers, consultants and contractors. The result is custom, one-off APIs that are highly fragmented, fragile, hard to centrally manage and often insecure. The API dungeon is an ugly maze of complexity (Figure 2):

Custom, manually coded REST APIs for every new application project, written with different tools and developer frameworks.
REST APIs are hardwired to different databases and file storage systems.
REST APIs run on different servers or in the cloud.
REST APIs have different security mechanisms, credential strategies, user management systems and API parameter names.
Data access rights are confused, user management is complex and application deployment is cumbersome.
The system is difficult to manage, impossible to scale and full of security holes.
API documentation is often non-existent. Often, companies can't define what all the services do, or even where all of the endpoints are located.

The core mistake with the API dungeon is that development activity starts with business requirements and application design, and then works its way back to server-side data sources and software development. This is the wrong direction.
The best approach is to identify the data sources that need to be API-enabled and then create a comprehensive and reusable REST API platform that supports general-purpose application development (Figure 3). There are huge benefits to adopting a reusable REST API strategy.
This sounds good in theory, but what are the actual technical characteristics of reusable REST APIs? And how should reusable APIs be implemented in practice? The reality is that there's no obvious way to arrive at this development pattern until you've tried many times the wrong way, at which point it's usually too late.
DreamFactory tackled the API complexity challenge for over a decade, built a reusable API platform internally for our own projects, and open-sourced the platform for any developer to use. We had to start from scratch many times before hitting on the right design pattern that enables our developers to build applications out of general-purpose interfaces.
There are some basic characteristics that any reusable
REST API should have:
REST API endpoints should be simple and provide parameters to support a wide range of use cases.
REST API endpoints should be consistently structured for
SQL, NoSQL and file stores.
REST APIs must be designed for high transaction volume,
hence simply designed.
REST APIs should be client-agnostic and work interchangeably well for native mobile, HTML5 mobile and
web applications.
A reusable API should have the attributes below to support a
wide range of client access patterns:
Noun-based endpoints and HTTP verbs are highly effective. Noun-based endpoints should be programmatically
generated based on the database schema.
Requests and responses should include JSON or XML
with objects, arrays and sub-arrays.
All HTTP verbs (GET, PUT, DELETE, etc.) need to be
implemented for every use case.
Support for web standards like OAuth, CORS, GZIP and
SSL is also important.
It's crucially important to have a consistent URL structure for accessing any backend data source. The File Storage API should be a subset of the NoSQL API, which should be a subset of the SQL API (Figure 4).
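By way of illustration, a hypothetical URL scheme with that nesting property (the paths and parameters are invented for this example, not DreamFactory's actual API):

GET /api/v1/sql/customers?filter=city%3DParis&fields=name,email   (SQL: filtered records from a table)
GET /api/v1/nosql/orders/5f2a                                     (NoSQL: a document by ID)
GET /api/v1/files/reports/2015/summary.pdf                        (file storage: a stored object)

Each API accepts a subset of the parameters of the one above it, so client code can address any backend through the same conventions.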
APIs
Milliseconds matter
by Per Buer
In recent years, web APIs have exploded. Various tech-industry watchers now see them as providing the impetus for a whole API economy. As a result, and in order to create a fast track for business growth, more and more companies and organizations are opening up their platforms to third parties. While this can create a lot of opportunities, it can also have huge consequences and pose risks. These risks don't have to be unforeseen, however.

Companies' checklists for building or selecting API management tools can be very long. Most include the need to offer security (both communication security, TLS, and actual API security, keys), auditing, logging, monitoring, throttling, metering and caching. However, many overlook one critical factor: performance. This is where you can hedge your bets and plan for the potential risk.
But what limit is high enough to provide a competitive advantage in our accelerated times? Take, for example, an industry like banking, where many players are opening up their platforms in a competitive bid to attract developers who create third-party apps and help monetise the data. The ones that set the API call limit too low create a bad developer experience, pushing developers towards friendlier environments.
A limited number of API calls in web services also affects the end customer. Take, for example, online travel operators or online media. In these environments a lot of data needs to flow through the APIs, and these businesses are becoming more dependent on fast and smooth communication between their services and their various apps. If these services slow down due to API call limitations, customers will defect to faster sites.
I compared the situation of APIs with that of the web ten years ago, when performance started to matter. The situation that actually developed is much more serious than I initially predicted. Consumers increasingly demand instant gratification. This means that the window for companies to ensure the performance of their APIs is closing. Being able to deliver performance and set a higher limit on API calls can make a huge difference. Otherwise, developers will go elsewhere to help grow another company's business. If you want to future-proof for the API boom, it's time to consider the performance factor.
Per Buer is the CTO and founder of Varnish Software, the company behind the open source project Varnish Cache. Buer is a former programmer
turned sysadmin, then manager turned entrepreneur. He runs, cross country skis and tries to keep his two boys from tearing down the house.
Imprint
Publisher
Software & Support Media GmbH
Editorial Office Address
Software & Support Media
Saarbrücker Straße 36
10405 Berlin, Germany
www.jaxenter.com
Editor in Chief: Sebastian Meyen
Editors: Coman Hamilton, Natali Vlatko
Authors: Kris Beevers, Per Buer, Ben Busse, Holly Cummins, Aysylu Greenberg, Patricia Hines, Eric Horesnyi, Werner Keil, Angelika Langer, Aviran Mordo, Chris Neumann, Lyndsay Prewer, Zigmars Raascevskis, Sheldon Smith, Colin Vipurs, Geertjan Wielenga
Copy Editor: Jennifer Diener
Creative Director: Jens Mainz
Layout: Flora Feher, Dominique Kalbassi
Sales Clerk: Anika Stock, +49 (0) 69 630089-22, astock@sandsmedia.com
Entire contents copyright 2015 Software & Support Media GmbH. All rights reserved. No
part of this publication may be reproduced, redistributed, posted online, or reused by any
means in any form, including print, electronic, photocopy, internal network, Web or any other
method, without prior written permission of Software & Support Media GmbH.
The views expressed are solely those of the authors and do not reflect the views or position of their firm, any of their clients, or Publisher. Regarding the information, Publisher
disclaims all warranties as to the accuracy, completeness, or adequacy of any information, and is not responsible for any errors, omissions, inadequacies, misuse, or the consequences of using any information provided by Publisher. Rights of disposal of rewarded
articles belong to Publisher. All mentioned trademarks and service marks are copyrighted
by their respective owners.