Beruflich Dokumente
Kultur Dokumente
Classification of NoSQL databases based on data model: A basic classification based on data
model, with examples:
Document: Clusterpoint, Apache CouchDB, Couchbase, DocumentDB, HyperDex, Lotus
Notes, MarkLogic, MongoDB, OrientDB, Qizx
Key-value: CouchDB, Oracle NoSQL Database, Dynamo, FoundationDB, HyperDex,
MemcacheDB, Redis, Riak, FairCom c-treeACE, Aerospike, OrientDB, MUMPS
Graph: Allegro, Neo4J, InfiniteGraph, OrientDB, Virtuoso, Stardog
Multi-model: OrientDB, FoundationDB, ArangoDB, Alchemy Database, CortexDB
Supports Yes, updates can be configured to In certain circumstances and at certain levels
8.
Transactions complete entirely or not at all (e.g., document level vs. database level)
Specific language using Select,
Data Insert, and Update statements, e.g.
9. Through object-oriented APIs
Manipulation SELECT fields FROM table
WHERE…
Depends on product. Some provide strong
Can be configured for strong
10. Consistency consistency (e.g., MongoDB) whereas others
consistency
offer eventual consistency (e.g., Cassandra)
Unit III: SQL using R
3.2. Connecting R to NoSQL databases
Lab Activity: No SQL Example: R script to access a XML file:
Step1: Install Packages plyr,XML
Step2: Take xml file url
Step3: create XML Internal Document type object in R using xmlParse()
Step4 :Convert xml object to list by using xmlToList()
Step5: convert list object to data frame by using ldply(xl, data.frame)
install.packages("XML")
install.packages("plyr")
> fileurl<-"http://www.w3schools.com/xml/simple.xml"
> doc<-xmlParse(fileurl,useInternalNodes=TRUE)
> class(doc)
[1] "XMLInternalDocument" "XMLAbstractDocument"
> doc
<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu>
<food>
<name>Belgian Waffles</name>
<price>$5.95</price>
<description>Two of our famous Belgian Waffles with plenty of real maple
syrup</description>
<calories>650</calories>
</food>
<food>
> xl<-xmlToList(doc)
> class(xl)
[1] "list"
> xl
$food
$food$name
[1] "Belgian Waffles"
$food$price
[1] "$5.95"
$food$description
[1] "Two of our famous Belgian Waffles with plenty of real maple syrup"
$food$calories
[1] "650"
$food
> data<-ldply(xl, data.frame)
> head(data)
.id name price
1 food Belgian Waffles $5.95
2 food Strawberry Belgian Waffles $7.95
3 food Berry-Berry Belgian Waffles $8.95
4 food French Toast $4.50
5 food Homestyle Breakfast $6.95
Description:
1 Two of our famous Belgian Waffles with plenty of real maple syrup
2 Light Belgian waffles covered with strawberries and whipped cream
3 Light Belgian waffles covered with an assortment of fresh berries and whipped cream
4 Thick slices made from our homemade sourdough bread
5 Two eggs, bacon or sausage, toast, and our ever-popular hash browns
calories
1 650
2 900
3 900
4 600
5 950
2. XLConnect: It might be slow for large dataset but very powerful otherwise.
require (XLConnect)
wb <- loadWorkbook("myfile.xlsx")
myDf <- readWorksheet(wb, sheet = "Sheet1", header = TRUE)
3. xlsx: This package requires JRM to install. This is suitable for java supported envirments.Prefer
the read.xlsx2() over read.xlsx(), it’s significantly faster for large dataset.
require(xlsx)
read.xlsx2("myfile.xlsx", sheetName = "Sheet1")
xlsReadWrite: Available for Windows only. It’s rather fast but doesn’t support. .xlsx files which is a
serious drawback. It has been removed from CRAN lately.
read.table(“clipboard”): It allows to copy data from Excel and read it directly in R. This is the
quick and dirty R/Excel interaction but it’s very useful in some cases.
myDf<- read.table("clipboard")
Here’s an example of how to integrate the batch file above within your VBA code.
Rexcel is the only tool I know for the task. Generally speaking once you installed RExcel you insert the
excel code within a cell and execute from RExcel spreadsheet menu. See the RExcel references below
for an example.
This is something I came across but I never tested it myself. This is a two steps process. First
write a VBscript wrapper that calls the VBA code. Second run the VBscript in R with the
system or shell functions. The method is described in full details here.
RExcel is a project developped by Thomas Baier and Erich Neuwirth, “making R accessible
from Excel and allowing to use Excel as a frontend to R”. It allows communication in both
directions: Excel to R and R to Excel and covers most of what is described above and more. I’m
not going to put any example of RExcel use here as the topic is largely covered elsewhere but I
will show you where to find the relevant information. There is a wiki for installing RExcel and
an excellent tutorial available here. I also recommend the following two documents: RExcel –
Using R from within Excel and High-Level Interface Between R and Excel. They both give an
in-depth view of RExcel capabilities.