Columbro - Alfresco 4 Scalability and Benchmarking

Alfresco Scalability Benchmarking
Before telling how cool Alfresco is, you'd better prove it!
A few items we'll talk about
Scalability and benchmarking in the ECM context
Alfresco 4 is rocket scalable, and we have proof
If your software scales, your BM framework must scale more!
Scalability and benchmarking

What do you know about it?
One size fits all
How scalable is your ECM?

ECM systems can be scalable or they can fail to scale well. They can have modular architectures that allow you to simply add more elements as required, rather than multiply the entire system as things expand. They can be scalable in that they have built-in high availability and automatic failover support, and run on enterprise-grade application servers and databases. They can be scalable because they have been tested and proven to handle very high volumes (hundreds of millions of documents) in the repository and/or tested and proven to handle very high throughput rates (tens of thousands per hour or minute). There are many ways in which an ECM system can scale or not. But the biggest element determining whether the system can scale is your usage of it.
http://www.realstorygroup.com/Blog/1403-Scalable-ECM
Alfresco ECM Solutions
Factors in Alfresco Scalability

Just so we talk the same language:
What do you need to know about Alfresco 4?

Key scalability changes:
The game changer(s): Node creation in Alfresco 3.x
The game changer(s): Node creation in Alfresco 4.x
NOTE: Solr will index with a default period of 15s
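Because Solr indexes on a period (15s by default, per the note above), a benchmark step that searches for a freshly created node may have to wait before the node becomes searchable. The following is a minimal sketch of such a wait. The function name, the caller-supplied `search` callable, the one-second poll interval and the two-period timeout are all illustrative assumptions, not part of any Alfresco or Solr API.

```python
import time
from typing import Callable


def wait_until_indexed(
    search: Callable[[], bool],
    index_period_s: float = 15.0,   # default Solr indexing period from the slide
    timeout_factor: float = 2.0,    # assumption: give up after two periods
    poll_interval_s: float = 1.0,   # assumption: how often to repeat the search
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Poll `search` until it returns True and return the seconds waited.

    Raises TimeoutError if the node is still not searchable after
    index_period_s * timeout_factor seconds.
    """
    start = clock()
    deadline = start + index_period_s * timeout_factor
    while True:
        if search():
            return clock() - start
        if clock() >= deadline:
            raise TimeoutError("node not searchable within the expected indexing window")
        sleep(poll_interval_s)
```

The `clock` and `sleep` parameters exist only so the helper can be exercised without real waiting; a load script would normally leave them at their defaults.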
The game changer(s): Node search in Alfresco 3.x
The game changer(s): Node search in Alfresco 4.x
NOTE: Solr needs to track ACLs!
Alfresco 4 Scalability Benchmarks

The journey to the ultimate knowledge of Alfresco
Why benchmarking?
Verba volant, scripta manent

Alfresco Product
Validation of Alfresco 4 scalability
Product bottlenecks
Alfresco Field
Sizing
Achieve even more stellar use-cases
Customers & partners
Provide reference milestones
Allow contextualized benchmarks
Benchmark projects overview

Project | Description | Version
BM-0001 | Alfresco 4 Scalability | 4.0.0
BM-0002 | 3.4 vs. 4.0 (Share Scenario, 360 users, 1 node) | 3.4.8, 4.0.0
BM-0003 | SOLR vs. Lucene (Share Scenario) | 4.0.0
BM-0004 | Uncover repository limit using simple Site-based data | 4.0.2
BM-0005 | Measure Alfresco Cloud signup rates up to 125K users | Cloud
BM-0006 | Measure Activiti workflow service performance | 4.0.2
BM-0007 | Measure Alfresco workflow API performance | 4.0.2
BM-0008 | Simulate 125K users on Alfresco Cloud | Cloud
BM-0009 | Define optimal tuning and extrapolate sizing information for large scale Share Enterprise deployment | 4.1.1.x
BM-0001 and BM-0009

Common factors
Benchmark lab
Enterprise collaboration scenario
Technologies (Alfresco Enterprise + JMeter)
Differences
Load testing scripts
Database tier
Repository content
Both provided useful insight!
The benchmark lab (HW)
BM-0001 Alfresco 4 Scalability benchmark

You can never forget your first time!
Objective
Statistical analysis of Alfresco scalability
Best-effort pre-tuning
Scenario
Search-intensive collaboration scenario
10s think time
Implemented with JMeter
Async requests
Memory intensive
BM-0001 Scenario
http://svn.alfresco.com/repos/alfresco-open-mirror/benchmark/scripts/SHARE/share-0001/V4.0.0/
BM-0001 Scalability data points
BM-0001 Architecture
BM-0001 - Software
BM-0001 Scalability results
In other words
BM-0001 take-aways
1100 concurrent users on 10M docs!
With a high search percentage, load is mostly on Solr
Share is lightweight, repo not loaded
Solr can be memory intensive
Make sure you give it enough memory!
Scale out when needed!
A dedicated Alfresco node for Solr tracking is beneficial
The Alfresco 4 Scalability blueprint

Broad intro to Alfresco 4 scalability
Scalability analysis of BM-0001 results
Architectural options
Tuning and configuration details
Load analysis
Sizing and performance reference
Available for Enterprise Customers & Partners at http://support.alfresco.com
BM-0009 Real life collaboration

Errare humanum est, perseverare diabolicum!
Objective
Create a real repository and scenario
2*Alfresco + 2*Solr
Find optimal tuning / sizing for large concurrency
Scenario
Much less search intensive than BM-0001
15s think time
Implemented with JMeter
More async requests
Less memory intensive
BM-0009 Scenario
https://svn.alfresco.com/repos/alfresco-enterprise/benchmark/scripts/SHARE/share-0002/4.0.2/
BM-0009 Repository details

Share sites: 10 000
Avg members per site: 100 + 3 random groups
Files per site: 10 docs of 2MB + 1k docs of 1kB
3-level-deep folder structure; each folder of the hierarchy contains 5 documents and 5 folders
Users in repo: 50 000
Groups: 30 000, hierarchical (depth=7)
Storage used: 1TB
Number of nodes: >10M
Number of assocs: >10M child assocs
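As a quick sanity check, the per-site figures above can be multiplied out to see that they are consistent with the ">10M nodes" total. This is only a sketch of the arithmetic: it counts documents alone (folders, users and groups would add further nodes), and the function and parameter names are made up for illustration.

```python
def documents_in_repository(sites: int, large_docs_per_site: int, small_docs_per_site: int) -> int:
    """Total document count for a repository made of identically populated sites."""
    return sites * (large_docs_per_site + small_docs_per_site)


# Figures from the BM-0009 repository details:
# 10 000 sites, each with 10 documents of 2MB and 1 000 documents of 1kB.
total_documents = documents_in_repository(
    sites=10_000,
    large_docs_per_site=10,
    small_docs_per_site=1_000,
)
print(f"{total_documents:,} documents")
```

The result is a little over ten million documents, which matches the order of magnitude of the node count reported for the repository.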
BM-0009 Architecture
BM-0009 - Software
Tier machines | OS | Relevant software | Details
Balancing Tier | RHEL5 | Apache Httpd 2.3 | mod_proxy and mod_proxy_ajp to use Httpd to balance requests
Alfresco Tier | RHEL5 | Alfresco 4.1.1.2, Apache Tomcat 6.0.29 | Using Shared Storage for shared content store via NFS
DBMS Tier | RHEL5 | MySQL 5.5.25 | Single node deployment
Index Tier | RHEL5 | Alfresco Solr 4.1.1.2 | Indexes on local RAID 5 disk
Shared Storage | RHEL5 | NFSd | Exposing an NFS share mounted on Alfresco cluster nodes
Load Test Client Drivers | RHEL5 | Jakarta JMeter 2.5.1 | Running JMeter
BM-0009 Scalability results
Once again
BM-0009 take-aways
Did I already say Alfresco ROCKS?
Even on a high-end realistic scenario, avg time 1.2s with 500 concurrent users (with 2*Alfresco and 2*Solr)
Dedicated Alfresco tracking beneficial mostly for operational purposes (similar performance)
Adding Solr nodes allows further degrees of scalability
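To put the headline figures in perspective, concurrency, response time and think time can be turned into an approximate request rate with the interactive response time law, X = N / (R + Z). This is a rough sketch under assumptions the slides do not state: a closed workload in which every user issues one request, waits for the response, then pauses for the full think time, and in which "avg time" means the average response time per request. The asynchronous requests in the real scenario would make the actual rate differ.

```python
def closed_system_throughput(users: int, response_time_s: float, think_time_s: float) -> float:
    """Interactive response time law: X = N / (R + Z), in requests per second."""
    if users < 0 or response_time_s < 0 or think_time_s < 0:
        raise ValueError("inputs must be non-negative")
    cycle_s = response_time_s + think_time_s
    if cycle_s == 0:
        raise ValueError("response time plus think time must be positive")
    return users / cycle_s


# BM-0009 figures: 500 concurrent users, 1.2s average time, 15s think time.
bm0009_rate = closed_system_throughput(users=500, response_time_s=1.2, think_time_s=15.0)
print(f"about {bm0009_rate:.1f} requests/s")
```

Under these assumptions the BM-0009 load works out to roughly 31 requests per second across the cluster.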
Lessons learned (tuning tips)

As we are not in marketing, numbers are not enough!
Make sure the house is clean!

Allow enough open files
Allow enough connections to your DB
Allow enough Tomcat threads (but don't exaggerate!)
Disable your anti-virus!
Check your balancer, and first test without it!
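The open-files item in the checklist above can be verified from the account that runs the server before a test starts. The sketch below uses Python's standard `resource` module (Unix only) to read the per-process limit. The `required` threshold is something you choose for your own deployment; the slides give no number, and the value in the usage line is purely illustrative.

```python
import resource
from typing import NamedTuple


class OpenFilesCheck(NamedTuple):
    soft_limit: int
    hard_limit: int
    sufficient: bool


def check_open_files(required: int) -> OpenFilesCheck:
    """Compare the soft RLIMIT_NOFILE of the current process with `required`."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    unlimited = soft == resource.RLIM_INFINITY
    return OpenFilesCheck(soft, hard, unlimited or soft >= required)


# Hypothetical threshold, chosen only for illustration.
result = check_open_files(required=4096)
print(f"soft={result.soft_limit} hard={result.hard_limit} sufficient={result.sufficient}")
```

Run it as the same user that starts Tomcat, since limits are applied per user and per process.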
Areas you might want to tune

Alfresco Repository
db.pool.* (and DB connections accordingly)
(only if you know what you're doing) L2 caches and transactional caches
solr.maxConnections
Solr
Solr caches in solrcore.properties
Number of tracking threads (default=3). I/O bound, so don't exaggerate!
alfresco.maxConnections
mergeFactor in solrconfig.xml
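The "DB connections accordingly" remark amounts to simple bookkeeping: the database has to accept at least as many connections as all repository nodes can open together. A minimal sketch of that sum follows. The pool size of 275 and the headroom of 10 are illustrative values, not figures from the benchmark, and the function name is made up.

```python
def required_db_connections(pool_max_per_node: int, repository_nodes: int, headroom: int = 0) -> int:
    """Minimum connection limit the database should allow.

    pool_max_per_node: upper bound of each node's connection pool (the db.pool.* setting)
    repository_nodes:  number of Alfresco nodes that connect to the database
    headroom:          extra connections for admin sessions and tools (assumption)
    """
    if pool_max_per_node < 0 or repository_nodes < 0 or headroom < 0:
        raise ValueError("inputs must be non-negative")
    return pool_max_per_node * repository_nodes + headroom


# Two Alfresco nodes, as in BM-0009; pool size and headroom are hypothetical.
needed = required_db_connections(pool_max_per_node=275, repository_nodes=2, headroom=10)
print(f"database should allow at least {needed} connections")
```

If a dedicated Alfresco node is added for Solr tracking, it counts as one more node in this sum.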
Dedicated vs. shared tracking
Benchmark gotchas
JMeter is very memory intensive
30G to scale to 1100 users!
Not fit for cloud-scale
JMeter does not parse JavaScript
Complex hacked implementation to mimic Share
Full emulation is impossible!
We need a scalable, distributed, flexible framework to run cloud-scale benchmarks!
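A back-of-the-envelope extrapolation shows why the JMeter memory footprint rules out cloud-scale runs. The sketch below assumes that memory grows linearly with the number of simulated users, which the slides do not claim, and it uses the 125K-user figure from the Cloud benchmark projects as the target.

```python
def extrapolate_memory_gb(measured_gb: float, measured_users: int, target_users: int) -> float:
    """Linear extrapolation of load-driver memory (assumption: constant memory per user)."""
    if measured_users <= 0:
        raise ValueError("measured_users must be positive")
    return measured_gb / measured_users * target_users


# Measured: 30G for 1100 users. Target: 125 000 simulated users.
per_user_mb = 30 * 1000 / 1100
cloud_scale_gb = extrapolate_memory_gb(measured_gb=30, measured_users=1100, target_users=125_000)
print(f"about {per_user_mb:.0f} MB per user, about {cloud_scale_gb:.0f} GB for 125K users")
```

Even if the real growth were well below linear, a single load driver would be far out of reach, which is the motivation for a distributed framework.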
Alfresco Benchmark framework

I.e. cool people doing even cooler stuff to prove they are cool!
Derek Hulley, Repository Lead Engineer
Gabriele Columbro, Principal Architect, Consulting Services
@mindthegabz
Derek Hulley, Founding Engineer and Repository Team Lead
