
The Cloud For Social Media Big Data Collection, Computing, Visualization and Delivery

The cloud platform enhances data collection and processing with the following:

- 7x24-hour data collection from Sina-Weibo
- Robust data storage
- Efficient data processing and distilling capabilities based on the dataset
- Powerful data visualization

Details about the cloud platform:

1. Hardware: The server hardware configuration is a Xeon E5 CPU with 8 cores / 16 threads and 48 GB of memory.
2. Operating system of the virtual machines: Linux (Debian or Ubuntu) is used to run the programs/systems for data collection and processing.
3. Database: Data is managed in a central database. MySQL may not be an ideal choice due to its low efficiency in handling very large amounts of data. A graph database may be considered, as graph databases store and present data in a graph structure of nodes, edges and properties (http://en.m.wikipedia.org/wiki/Graph_database).
4. Programming language and environment: The Weibo APIs only support JSON, so Java is used to collect data from Sina-Weibo and Twitter. Eclipse will be used as the development environment.
5. Distributed data collection: The VMs for data collection should be as many as possible, and each can be kept very small in Linux (e.g. 256 MB-512 MB of memory). Given that one server has 64 GB of memory, we can probably run hundreds of VMs on one server. We can start testing with a small number of VMs and gradually increase the number.
6. Methodology of the data collection process: Each iteration of data collection should start either from a user or from a status (or keyword). The data should be re-scraped, because the data we have now contains no accurate node information for edges and includes no status content. If we re-scrape, the additional information to collect is: exact follower and following lists, status content, comments and replies, and the user information of commenting/replying users.

Problems:

1. Choice of a database that handles big data efficiently. Evaluate the graph database by creating a large random network and gradually increasing the number of nodes and edges.
Then do some basic statistics such as degree, average path length, community detection, etc., tracking time and memory usage for the different operations.
2. Managing large numbers of VMs: this will be part of our future work on the cloud platform.
3. Archiving and managing the big data. We will probably need Hadoop with MapReduce or other, more powerful tools.
4. The token expiration problem: due to Sina-Weibo's policy, an access token for TEST purposes expires every 24 hours. In the past, we simply renewed the access tokens manually every day. Can we borrow any ideas from Oxford's solution? Verify with Ning whether Twitter has the same issue. Otherwise, we will write a program that obtains a token automatically and triggers the current scraper to use the new token automatically.
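The automatic token rotation described in problem 4 could be sketched as a small scheduled task. This is a minimal sketch: `fetchNewToken()` is a hypothetical placeholder, and the real implementation would call the Sina-Weibo OAuth endpoint with the credentials documented in the Weibo API; only the 23-hour refresh cycle (just inside the 24-hour expiry) is the point here.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Sketch of the automatic token refresher from problem 4.
public class TokenRefresher {
    private final AtomicReference<String> token = new AtomicReference<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // Hypothetical placeholder: a real version would call the
    // Sina-Weibo OAuth endpoint and return the new access token.
    String fetchNewToken() {
        return "token-" + System.currentTimeMillis();
    }

    /** Refresh immediately, then every 23 hours (just before the 24 h expiry). */
    public void start() {
        scheduler.scheduleAtFixedRate(
                () -> token.set(fetchNewToken()), 0, 23, TimeUnit.HOURS);
    }

    /** The scraper reads the current token before each API call. */
    public String currentToken() {
        return token.get();
    }

    public void stop() {
        scheduler.shutdown();
    }
}
```

Keeping the token in an `AtomicReference` lets the running scraper pick up the new token without restarting, which is the "trigger the current scraper automatically" requirement above.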
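The database evaluation in problem 1 — generate a random network, then compute basic statistics while tracking resource usage — might start from an in-memory sketch like the one below. The random generator and BFS here are illustrative stand-ins; a real benchmark would issue the equivalent queries against the candidate graph database and measure time and memory per operation.

```java
import java.util.*;

// Sketch of the evaluation in problem 1: build a random undirected graph
// of n nodes and a given number of edges, then compute degrees and the
// average shortest-path length via BFS from every node.
public class GraphEval {
    final List<Set<Integer>> adj;

    GraphEval(int n, int edges, long seed) {
        adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new HashSet<>());
        Random rnd = new Random(seed);
        int added = 0;
        while (added < edges) {
            int u = rnd.nextInt(n), v = rnd.nextInt(n);
            if (u != v && adj.get(u).add(v)) {   // skip self-loops and duplicates
                adj.get(v).add(u);
                added++;
            }
        }
    }

    int degree(int v) {
        return adj.get(v).size();
    }

    /** Average shortest-path length over all reachable node pairs. */
    double averagePathLength() {
        long total = 0, pairs = 0;
        for (int s = 0; s < adj.size(); s++) {
            int[] dist = new int[adj.size()];
            Arrays.fill(dist, -1);
            dist[s] = 0;
            Deque<Integer> queue = new ArrayDeque<>(List.of(s));
            while (!queue.isEmpty()) {
                int u = queue.poll();
                for (int v : adj.get(u))
                    if (dist[v] < 0) { dist[v] = dist[u] + 1; queue.add(v); }
            }
            for (int v = 0; v < adj.size(); v++)
                if (v != s && dist[v] > 0) { total += dist[v]; pairs++; }
        }
        return pairs == 0 ? 0 : (double) total / pairs;
    }
}
```

Growing `n` and `edges` step by step, as proposed above, would show how each candidate database scales on exactly these operations.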
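The iteration described under the data collection methodology — start from a seed user and expand through follower/following lists — amounts to a breadth-first snowball crawl. In this sketch, `fetchFollowers` is a hypothetical stand-in backed by a toy map; a real crawler would call the Weibo follower-list API and persist users and statuses to the central database at the marked step.

```java
import java.util.*;

// Sketch of the snowball data-collection iteration: start from a seed
// user ID and expand breadth-first through follower lists.
public class SnowballCrawler {
    // Toy stand-in for the Weibo followers API.
    private final Map<Long, List<Long>> followerLists;

    SnowballCrawler(Map<Long, List<Long>> followerLists) {
        this.followerLists = followerLists;
    }

    List<Long> fetchFollowers(long userId) {
        return followerLists.getOrDefault(userId, List.of());
    }

    /** Visit up to maxUsers users reachable from the seed, breadth-first. */
    public List<Long> crawl(long seed, int maxUsers) {
        List<Long> visited = new ArrayList<>();
        Set<Long> seen = new HashSet<>(List.of(seed));
        Deque<Long> queue = new ArrayDeque<>(List.of(seed));
        while (!queue.isEmpty() && visited.size() < maxUsers) {
            long user = queue.poll();
            visited.add(user);   // real crawler: store user, statuses, edges here
            for (long follower : fetchFollowers(user))
                if (seen.add(follower)) queue.add(follower);
        }
        return visited;
    }
}
```

The `seen` set is what guarantees the exact, de-duplicated follower and following lists that the re-scrape is meant to recover.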

Frame of cloud platform: The frame of the cloud platform is shown in Figure 1 below. The Cloud Platform contains the Central Database, the Data Selection and Data Visualisation components, and a Main Function that coordinates the Data Crawler instances running on VM1, VM2, ..., VMn, together with the Token Refresher.

[Figure: block diagram of the cloud platform components listed above]

Figure 1. Frame of Cloud Platform
