Paper: Balancing your approach to Big Data

Inside:
Criteria for evaluating your enterprise approach
Tips for getting started
For the last four years, the research team at CTOlabs.com has been contributing studies, analysis, and community events on the topic of Big Data. Through our leadership of venues like the yearly Government Big Data Forum, and our continuous dialog with thought leaders via our weekly Government Big Data Newsletter, we have sought to highlight lessons learned, share best practices, and foster a greater dialog among practitioners fielding real-world solutions in the Big Data space. Our research team, led by former CTO of the Defense Intelligence Agency Bob Gourley, has been collecting community advice and success tips on the implementation of Big Data projects, with the goal of continually feeding those back to the community to enhance as many efforts as possible. This paper presents design criteria, best practices, and lessons learned in a way you can use to enhance your organization's approach to Big Data. We constructed this piece in a way we hope you will find logical and compelling, but we are ready at any time to provide more background, insights, introductions to thought leaders, or other additional information. Contact us at CTOlabs.com at any time to weigh in with your thoughts.
Lessons learned from the national security community give us a framework to solve Big Data challenges
National security enterprises, including military and intelligence organizations and the commanders that depend on them, were among the first to face today's big data challenges. National security missions have long required a rigorous analytical tradecraft known as sensemaking, a term that emphasizes the action orientation of operational analysis. Sensemaking is the creation of knowledge and the optimization of decisions from data. Sensemaking enables organizations to develop situational awareness and make maximum use of their data holdings. In the national security space, the most promising big data solutions are those that enable sensemaking in a balanced way, where analysts are empowered to do what they do best but supported by technologies that do what humans cannot, and are governed by policies forged from experience. We will leverage this conceptual framework of people, technology, and policy in a more prescriptive way below.
A balanced enterprise architecture can enable computers to apply analytics holistically over data holdings, comparing many millions of relationships and correlations to each other, leading to enhanced discovery and knowledge creation for presentation to analysts. The differentiation between human and machine computational reasoning is significant. Humans are incredibly flexible, adaptive, and broad in their reasoning constructs, but their reasoning is not deep, so they cannot handle large amounts of information or reasoning tasks. Computers, on the other hand, are incredibly efficient at handling large amounts of information and reasoning tasks, but are not flexible, adaptive, or broad in their reasoning constructs. As a side effect, the vast majority of human reasoning occurs at the subconscious level, making it difficult to near-impossible to audit the logic trail; computer reasoning is exposed and therefore auditable. Because auditability is difficult with human reasoning, confidence assessments must be based on past performance rather than a logical assessment of the analysis in question. The key takeaway for heightened analysis efficiency is to balance the human and the computer in an analytic functional pairing, taking advantage of the best of both worlds. This pairing is not limited to human sensory-assist functionality such as data visualization; it also extends to compare-and-contrast operations (pairwise analysis) that effectively discover more from difficult evidence in a Big Data environment. Without this pairing, the ability to exploit non-obvious relationships (NOR) is limited and the results are sub-par.
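To make the machine side of this pairing concrete, consider the minimal sketch below of automated pairwise analysis. It is an illustration only, not drawn from any fielded system: the entities, attributes, Jaccard overlap measure, and threshold are all assumptions chosen for clarity.

```python
from itertools import combinations

def jaccard(a: set, b: set) -> float:
    """Overlap between two attribute sets (0 = disjoint, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical entity -> observed attributes (phones, addresses, aliases).
entities = {
    "E1": {"phone:555-0101", "addr:12-elm", "alias:grey"},
    "E2": {"phone:555-0101", "addr:9-oak"},
    "E3": {"addr:12-elm", "alias:grey", "acct:4417"},
    "E4": {"acct:9902"},
}

# Machine side of the pairing: exhaustively compare every pair of entities,
# a task that is tedious for humans but trivial and auditable for computers.
THRESHOLD = 0.25  # assumed cutoff for "worth an analyst's attention"
candidates = []
for (name_a, attrs_a), (name_b, attrs_b) in combinations(entities.items(), 2):
    score = jaccard(attrs_a, attrs_b)
    if score >= THRESHOLD:
        candidates.append((score, name_a, name_b, sorted(attrs_a & attrs_b)))

# Human side of the pairing: review ranked candidates, highest overlap first.
for score, a, b, shared in sorted(candidates, reverse=True):
    print(f"{a} <-> {b}  score={score:.2f}  shared={shared}")
```

Because every comparison, measure, and threshold is explicit in code, the machine's contribution remains fully auditable, which is precisely the property contrasted above with subconscious human reasoning.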
Technologies that work over all types of data help balanced organizations leverage all their data holdings.
Veracity: What is the true meaning of the data? Pre-processing of data ensures the system knows data provenance and can assess validity, at speed. This is important with all data sources but is especially relevant for data created or touched by humans, such as social media. Technological contributions to veracity should also include advanced identity assessment/entity extraction and relationship building.

Volatility: When is data valuable, and when is it most valuable? Data often has a half-life of value, meaning it is valuable for a certain period of time. What makes this even more complex is that this volatility is often related to the availability and detection of other, similarly volatile data. It is like putting a puzzle together on a moving board. Responsive technologies like analytic visualization combined with automated compare/contrast (pairwise) operations assist here, but a balanced approach is needed; either alone may be incomplete. (A minimal half-life sketch follows below.)
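One simple way to operationalize volatility and veracity together is to decay each item's value by a source-specific half-life and weight it by a provenance-based trust score. The sketch below is a hypothetical illustration; the sources, half-lives, and weights are invented for the example.

```python
# Hypothetical provenance -> trust weight; human-touched sources score lower.
VERACITY = {"sensor": 0.9, "official_record": 0.95, "social_media": 0.4}

# Assumed half-lives in hours: how long each kind of data stays valuable.
HALF_LIFE_HOURS = {"sensor": 6.0, "official_record": 720.0, "social_media": 1.0}

def current_value(source: str, age_hours: float, base_value: float = 1.0) -> float:
    """Decay an item's analytic value by its source's half-life, then
    discount by how much the source's provenance is trusted (veracity)."""
    decay = 0.5 ** (age_hours / HALF_LIFE_HOURS[source])
    return base_value * decay * VERACITY[source]

# A social media post loses most of its value within hours; an official
# record barely decays over a full day.
for source, age in [("social_media", 3.0), ("sensor", 3.0), ("official_record", 24.0)]:
    print(f"{source:16s} after {age:4.1f}h: {current_value(source, age):.3f}")
```

Ranking holdings by current_value lets responsive tooling surface what is valuable right now, which is the puzzle-on-a-moving-board problem described above.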
Big Data alone, however, is not enough to fully grasp the significance of the intelligence and/or investigatory analysis problem. Doug Laney's highly applicable Big Data characterization model is focused on the data itself; within that data we need to further explore the impact of complex evidence that is difficult to isolate, identify, and comprehend. This is the additional concept of Difficult Evidence. Like Big Data, Difficult Evidence has several dimensions:

Sparse: This is the ratio between the evidence that matters (relates to and has analytic impact on the question or information goal) and the information at hand: the proverbial needles in the haystack, or often needles in the needle stack. Technology assists with filtering out the overall body of non-applicable information, but elevating the essential elements with analytics is essential to finding the key evidence (a toy filter-and-elevate sketch follows this list).

Obscure: Key evidence is rarely obvious. It can be incomplete, inaccurate, vague, or intermixed with non-applicable information. These obscuration factors make pulling out the essential evidentiary patterns extremely challenging. Obscuration cloaks the essential meaning of the evidence.

Ambiguous: Evidence can mean different things to different analysts. Like obscuration, ambiguity cloaks the meaning of the evidence in relation to the other evidence; it is a contextual obscuration. Disambiguation is best accomplished by relating other evidence to the ambiguous information, thus enhancing the context and eliminating multiple meanings.

Fragmented: Evidence is often not complete. Fragmentation occurs due to the nature of the information or as an artifact of its gathering and storage. Whether the silo-ing of the information is internal or external, the result is the same: evidence must often be identified, partially understood, holistically recognized, and associated with what is missing before context is achieved.
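As a toy illustration of the Sparse dimension, the sketch below filters a corpus with a coarse screen and then elevates the survivors with a simple relevance score. The documents, watch terms, and scoring are invented for the example; real systems would substitute entity extraction and richer analytics, but the filter-then-elevate shape is the same.

```python
# Hypothetical corpus: mostly noise, a few needles.
documents = [
    "quarterly cafeteria menu update",
    "wire transfer of 40k routed through shell account 4417",
    "parking garage maintenance schedule",
    "account 4417 linked to alias grey in border crossing log",
    "holiday party planning thread",
]

SCREEN_TERMS = {"account", "transfer", "alias"}  # assumed watch terms

# Filter: drop documents that contain no screening term at all.
filtered = [d for d in documents if SCREEN_TERMS & set(d.split())]

# Elevate: score survivors by how many distinct screening terms they hit.
def relevance(doc: str) -> int:
    return len(SCREEN_TERMS & set(doc.split()))

for doc in sorted(filtered, key=relevance, reverse=True):
    print(f"[{relevance(doc)}] {doc}")

# Sparsity ratio: evidence that matters vs. information at hand.
print(f"sparsity: {len(filtered)}/{len(documents)}")
```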
Functionality
Tailorability
Accessibility
Interoperability
Team Support: Can analysts work across the collaborative spectrum, from one independent analyst to an entire enterprise collaborating together? Can analysts move their conclusions quickly to others on the team and to decision-makers? Have coalition sharing capabilities been engineered that enable sanitization while protecting the essence of the information? (A minimal sanitization sketch follows below.)

Knowledge Capture: As new conclusions and insights are developed, they need to be smartly captured to build upon and for continued fusion and analysis. This can include knowledge from partners and others outside the organization.
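As a minimal illustration of the sanitization question under Team Support, the sketch below strips protected fields from a hypothetical finding and scrubs source designators from its free text while preserving the analytic conclusion. The field names, markings, and release policy are assumptions for the example.

```python
import re

# A hypothetical internal finding; field names and markings are assumed.
finding = {
    "conclusion": "Per HUMINT-77, account 4417 is linked to alias grey.",
    "confidence": "moderate",
    "source_id": "HUMINT-77",      # protected: must not leave the enclave
    "collector": "field-team-3",   # protected
}

RELEASABLE_FIELDS = {"conclusion", "confidence"}  # assumed release policy

def sanitize(record: dict) -> dict:
    """Keep only releasable fields, then scrub source designators that
    leaked into free text, preserving the analytic essence."""
    released = {k: v for k, v in record.items() if k in RELEASABLE_FIELDS}
    return {k: re.sub(r"HUMINT-\d+", "[REDACTED]", v) for k, v in released.items()}

print(sanitize(finding))
# {'conclusion': 'Per [REDACTED], account 4417 is linked to alias grey.',
#  'confidence': 'moderate'}
```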
The criteria above are best assessed in conjunction with experienced analysts who know your organization's mission and function and are familiar with their current tools. But keep in mind that analysts are not paid to know the full potential of modern technologies. The additional factors below are best evaluated in conjunction with both your internal technology team and the broader technology community.
Data Grooming
Multi-Dimensional Security
Synchronized: Do you have an architecture that facilitates sharing of secure information for both service (request/response) and notification (publish/subscribe) interactions via widely supported standards and best practices? Does this architecture provide the flexibility and adaptability needed to keep pace with the change and evolution of data types and volumes, analytic tools, and the analytic mission? (A minimal sketch of the two interaction styles follows below.)

Enhancability: Can enterprise IT staff tailor capabilities for analyst use, or do they need to task an outside vendor to re-code capabilities?
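To make the Synchronized criterion concrete, the sketch below shows toy in-process versions of the two interaction styles it names: request/response as a direct service call, and publish/subscribe as a topic bus that notifies registered subscribers. It is an illustration only; a production architecture would use standards-based middleware rather than this in-memory stand-in.

```python
from collections import defaultdict
from typing import Callable

class TopicBus:
    """Minimal publish/subscribe: handlers register per topic and are
    notified whenever an event is published to that topic."""
    def __init__(self) -> None:
        self._subs: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subs[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in self._subs[topic]:
            handler(event)

# Request/response: a service answers a direct query (stubbed here).
def lookup_entity(entity_id: str) -> dict:
    return {"id": entity_id, "status": "known"}

bus = TopicBus()
bus.subscribe("new-evidence", lambda e: print("analyst notified:", e))

print(lookup_entity("E1"))                 # request/response
bus.publish("new-evidence", {"id": "E7"})  # publish/subscribe
```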
Most modern enterprises, especially those in the national security community, have already been building towards more service-oriented, data-smart structures, so it is very likely that your organization has a good foundation along this path. But remember, it is a journey, and the balanced approach your mission requires may well call for changes to your configuration, and perhaps even more modern technologies, to optimize your ability to support the mission.
Whatever the status of your technology infrastructure, you will need a good governance process in place to move toward a more optimized state.
Concluding Thoughts
Every enterprise is different, with different missions, infrastructures, and architectures. You may find that many of the criteria we outlined above are already met by your existing enterprise. A quick inventory of capabilities and gaps will help you assess the challenge and prioritize how you architect for improvement. We strongly recommend a structured engagement with your organization's analysts. They understand your organization's mission and vision and will likely be strong supporters of your move to bring more balance to your organization's approach to big data analytical solutions. Their prioritization of needs and capabilities should help drive organizational improvement plans. However, keep in mind that your analysts are not paid to understand the power of modern computing. External advice and assistance in this area, including connecting with other organizations that have met similar challenges, will provide important insights into your road ahead.

We have observed organizations making this type of transformation around the globe, including commercial organizations, government agencies, and militaries. One thing all seem to have in common is a deep need to automate with efficiency. For some this translates to a calculation of Return on Investment; for militaries it can be a more operationally focused Return on Mission. But in every case, understanding the efficiencies and total cost to the enterprise of a solution is critically important to ensuring success.
More Reading
For more on federal technology and policy issues, visit:
CTOvision.com - A blog for enterprise technologists with a special focus on Big Data.
CTOlabs.com - A reference for research and reporting on all IT issues.
J.mp/ctonews - Sign up for the government technology newsletters, including the Government Big Data Weekly.
Contact:
CTOlabs.com