
Memo

from Analytix.is

Top Questions for Evaluating Hadoop Platform Vendors


| Category | Requirement |
|---|---|
| Strategic Fit | Market Presence: What was your CY2014 worldwide revenue from Big Data solutions? |
| Strategic Fit | Number and Quality of References: Please provide 10 customers in our industry globally. At least 5 of these must be willing to entertain a reference check. |
| Strategic Fit | Hadoop Expertise: How long have your technical leaders worked with Apache Hadoop? |
| Strategic Fit | Apache Leadership: How many of your engineers are Apache Software Foundation (ASF) committers and PMC members for relevant Hadoop and Spark projects? |
| Strategic Fit | Open Source Philosophy: List which components of your distribution are governed by projects within the ASF and which are not. |
| Strategic Fit | Differentiators: How is your solution differentiated vis-a-vis your competitors? Please name the specific competitor products in any comparison you provide. |
| Strategic Fit | EDW Integration: Please describe how queries with predicate pushdown can be initiated on the existing EDW and then executed in a distributed but coordinated fashion across both Hadoop and the EDW. Please list any certifications in that respect. |
| Strategic Fit | Modular Architecture based on open standards: Please provide evidence that proprietary components are either absent or widely supported by an ecosystem with clearly specified APIs, and any evidence that your solution is not a black box and has no vendor lock-in. |
| Strategic Fit | Roadmap: Please provide as much clarity as possible on your roadmap. What is the anticipated release date and new feature list for each product or component over the next 12 months? |
| Strategic Fit | Technology Partnerships / Product Extendability: Please list which relevant third-party software in areas like visualization, application development, machine learning, data science, and security is certified on your distribution. |
| Strategic Fit | Analytic App Dev: What functionality enables analytic application development, such as well-maintained APIs, programming frameworks in common languages, or an integrated development environment / SDK? |
| Strategic Fit | Visualization: Access from existing reporting & visualization tools (QLIK, BOBJ Explorer, Tableau). Please list which are certified. |
| Strategic Fit | Industry-specific IP: Please list out-of-the-box functionality for specific use cases in our industry. |
| Strategic Fit | Service Intensity: For typical deployments at your customers (please state parameters indicating scope / size of deployment), how many professional services FTE days are required for a) the deployment, b) ongoing operations? |
| Strategic Fit | PS Ecosystem: Please describe the breadth and depth of your Professional Services Ecosystem (listing the number of partners and their respective number of certified staff). |
| Strategic Fit | Testing: Describe your testing and certification process prior to releasing a new version of your distribution. |
| Strategic Fit | Switching Costs: Can we use the ASF community to self-support your solution? How quickly do ASF fixes make it into the software? |
| Commercial Needs | Speed of Deployment: Ability to support a pilot in 1 month and full production in 3 months. E.g., can some solution modules be downloaded, pilot data ingested, and staff trained while commercials are still being worked on? |
| Commercial Needs | Support Capability: 24x7 in English, including SPOC for L1 calls. |
| Commercial Needs | Training: Please list your publicly available training & certification offerings. |
| Commercial Needs | Flexible Pricing: Pricing structure that enables extensions of capacity and capability in modular increments. Explain your pricing structure and list price levels. |
| Commercial Needs | Data Volume: Describe how performance and storage pricing change at different data volumes. |
| Data Management | High Availability: What HA features do you provide to minimize data center outages? Does your solution support zero downtime? |
| Data Management | Portability & Deployment: What operating system(s) do you support? Can we deploy in the cloud (e.g., for test/dev)? |
| Data Management | Heterogeneous Infrastructure: How does your Hadoop solution deal with disparate hardware pools in one data center (e.g., newer hardware next to older hardware; memory- vs. storage-optimized)? |
| Data Access | Extensibility: Would we be able to add new data applications that run in YARN? What YARN-ready ISV apps run on your platform? |
| Data Access | Use Cases: Do you have customers in our industry that use your distribution? Please describe the scope and use cases for up to 10 customers in our industry. |
| Data Access | Processing Engines: Can we simultaneously process data residing in the same multi-purpose cluster with many different processing engines? Which processing engines relevant to big data does your offering include? Our needs range across batch, interactive, real-time, streaming, search, machine learning, rules engine, recommendation engine, and data science. |
| Data Access | Existing Skills: Will our analysts be able to use their existing SQL skills to query data in Hadoop? Please list the SQL instruction set supported by your distribution. |
| Integration & Governance | Data Ingest: How would you move our raw data sources into Hadoop? What specific connectors do you provide for common telco network data sources / vendors? Please describe how your solution supports a wide range of ingestion capabilities, incl. NFS access to HDFS. |
| Integration & Governance | Replication and Retention: How will we be able to set centralized policies for data replication and retention? |
| Integration & Governance | Pipeline Monitoring: How do we gain insight into the source lineage of data in Hadoop? Can we set explicit policies for data flows? |
| Security | Administration: Do you provide central administration of security policy within the cluster? How do you coordinate enforcement across workloads? |
| Security | Authentication: How does your solution verify the identity of users and systems accessing the cluster? Please describe how you achieve granular role-based access control via AD, LDAP, Kerberos, Federated Identity, etc. |
| Security | Authorization: How do you provide fine-grained authorization to access data in the cluster? |
| Security | Audit: How will we be able to audit the actions taken by individual users? Please describe your auditing capabilities to track changes to configurations and data access (MR jobs, REST API, etc.). Provide a list of activities or events, with the corresponding information and attributes that are logged. |
| Security | Multi-Tenancy Internal: Please describe how you achieve internal multi-tenancy: tenant, data, network, and namespace separation in all services. |
| Security | Multi-Tenancy External: Please describe how you achieve external multi-tenancy: support of tenants outside of our company served by the same physical data lake instance that serves internal tenants. |
| Operations | Provision: What operating system(s) do you support? What are our cloud/on-premises deployment options? |
| Operations | Manage: What tools do you provide for the ongoing management of a Hadoop cluster? If GUI, please include screenshots. |
| Operations | Monitor: Do you provide a single pane of glass to monitor all cluster components with respect to performance, availability, and other runtime characteristics? |
| Operations | Extend: Can your management tool integrate with existing operations consoles? In particular, HP OpenView, IBM, and Teradata Viewpoint. |
