
log4j.rootLogger=${root.logger}
log.dir=/var/log/spark
log.file=spark-worker-gmo-cl-data-01.log
max.log.file.size=200MB
-max.log.file.backup.index=10
+max.log.file.backup.index=5
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${log.dir}/${log.file}
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
Find the log4j.properties for Spark and change the log directory manually:
find / -ctime -0.1 2>/dev/null | grep log4j.properties   <- finds files changed in the last 0.1 days (24*60*0.1 = 144 minutes)
/etc/hadoop/conf.cloudera.yarn/log4j.properties
/etc/hadoop/conf.cloudera.hdfs/log4j.properties
/etc/spark/conf.cloudera.spark/log4j.properties
/var/run/cloudera-scm-agent/process/2340-deploy-client-config/yarn-conf/log4j.properties
/var/run/cloudera-scm-agent/process/2336-deploy-client-config/aux/client/log4j.properties
/var/run/cloudera-scm-agent/process/2336-deploy-client-config/spark-conf/log4j.properties
/var/run/cloudera-scm-agent/process/2330-deploy-client-config/hadoop-conf/log4j.properties
/var/run/cloudera-scm-agent/process/2324-spark-SPARK_WORKER/config/log4j.properties <- this one had old /var/log/spark and new index
/var/run/cloudera-scm-agent/process/2324-spark-SPARK_WORKER/aux/client/log4j.properties
/var/run/cloudera-scm-agent/process/2324-spark-SPARK_WORKER/log4j.properties <- this one had old /var/log/spark
/var/run/cloudera-scm-agent/process/2324-spark-SPARK_WORKER/hadoop-conf/log4j.properties
log.dir=/var/log/spark
log.dir=/mnt/vol1/var/log/spark (worker default group advanced configuration safety valve)
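The search above can be made repeatable with a small helper. This is only a sketch: it assumes a find that supports -cmin and a grep that supports -m (GNU and BSD both do), and the function name recent_log4j_dirs and its two arguments are made up for illustration.

```shell
# Sketch: list log4j.properties files under a root directory whose status
# changed within the last N minutes, together with their log.dir line.
# recent_log4j_dirs and its arguments are illustrative names, not from the notes.
recent_log4j_dirs() {
    root=$1
    minutes=$2
    find "$root" -name log4j.properties -cmin "-$minutes" 2>/dev/null |
    while IFS= read -r f; do
        # Take the first log.dir= line, or print a marker if the key is absent.
        dir_line=$(grep -m1 '^log\.dir=' "$f" 2>/dev/null || echo 'log.dir not set')
        printf '%s: %s\n' "$f" "$dir_line"
    done
}
```

For the case in these notes the call would be something like recent_log4j_dirs /var/run/cloudera-scm-agent/process 144, which shows at a glance which generated copies still point at /var/log/spark.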
log.threshold=INFO
main.logger=RFA
root.logger=${log.threshold},${main.logger}
log4j.rootLogger=${root.logger}
-log.dir=/var/log/spark
+log.dir=/mnt/vol1/var/log/spark
log.file=spark-worker-gmo-cl-data-01.log
max.log.file.size=200MB
max.log.file.backup.index=5

log4j.appender.RFA=org.apache.log4j.RollingFileAppender
[dchtchou@gmo-cl-edge-02 SimpleApp]$ ls -a
. .. project run run~ simple.sbt simple.sbt~ src target
[dchtchou@gmo-cl-edge-02 SimpleApp]$ mkdir lib
[dchtchou@gmo-cl-edge-02 SimpleApp]$ cp /home/dchtchou/src/ion-pcm/SPARK/jars/spark-assembly-1.1.0-SNAPSHOT-hadoop2.3.0-cdh5.1.0.jar lib
[dchtchou@gmo-cl-edge-02 SimpleApp]$ ./run
[info] Set current project to Simple Project (in build file:/home/dchtchou/src/ion-pcm/SPARK/SimpleApp/)
[info] Compiling 1 Scala source to /home/dchtchou/src/ion-pcm/SPARK/SimpleApp/target/scala-2.10/classes...
[info] Packaging /home/dchtchou/src/ion-pcm/SPARK/SimpleApp/target/scala-2.10/simple-project_2.10-1.0.jar ...
[info] Done packaging.
[success] Total time: 6 s, completed Jul 30, 2014 5:50:35 PM
14/07/30 17:50:36 INFO SecurityManager: Changing view acls to: dchtchou
Compiled ok, but getting:
14/07/30 17:50:40 INFO SparkContext: Job finished: count at SimpleApp.scala:15, took 0.025717261 s
Lines with a: 1, Lines with b: 0
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/hive/HiveContext
        at SimpleApp$.main(SimpleApp.scala:21)
        at SimpleApp.main(SimpleApp.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:292)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.hive.HiveContext
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        ... 9 more
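The ClassNotFoundException means HiveContext was not on the classpath at run time, even though the build saw it. One way to check a candidate jar is to list its entries and look for the class file. The sketch below reads a jar listing on stdin (for example from unzip -l or jar tf); the helper name has_class is hypothetical.

```shell
# Sketch: succeed if a jar listing on stdin contains the given class.
# has_class is an illustrative name; the listing is expected to come from
# a command such as `unzip -l some.jar` or `jar tf some.jar`.
has_class() {
    # org.apache.spark.sql.hive.HiveContext -> org/apache/spark/sql/hive/HiveContext.class
    class_path="$(printf '%s' "$1" | tr '.' '/').class"
    # -F: match the path literally, -q: only report via exit status.
    grep -qF "$class_path"
}
```

Usage, assuming unzip is installed: unzip -l lib/spark-assembly-1.1.0-SNAPSHOT-hadoop2.3.0-cdh5.1.0.jar | has_class org.apache.spark.sql.hive.HiveContext && echo present. Running the same check against the assembly under SPARK_HOME shows whether the jar used at run time has the class.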
Will a restart of the Spark role help?
File /mnt/vol1/var/log/spark/spark-master-gmo-cl-edge-02.log
Nope.
Put /etc/hive/conf.cloudera.hive/hive-site.xml into /etc/alternatives/spark-conf/ on every node.
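Copying hive-site.xml to every node could be scripted along these lines. This is a sketch, not what was actually run: it assumes scp with working ssh access to each host, and the function name push_hive_site and the DRY_RUN switch are invented for illustration. Host names are passed as arguments.

```shell
# Sketch: copy hive-site.xml into the Spark conf directory on each host.
# push_hive_site and DRY_RUN are illustrative; ssh/scp access is assumed.
push_hive_site() {
    src=/etc/hive/conf.cloudera.hive/hive-site.xml
    dest=/etc/alternatives/spark-conf/
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            # Dry run: only print the command that would be executed.
            echo "scp $src $host:$dest"
        else
            scp "$src" "$host:$dest"
        fi
    done
}
```

With DRY_RUN=1 set, push_hive_site gmo-cl-data-01 gmo-cl-data-02 gmo-cl-data-03 gmo-cl-data-04 prints the four scp commands without touching any node, which is a cheap way to check the paths first.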

Since this is the Spark home, the Spark assembly there needs to be replaced as well:
export SPARK_HOME=/opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/lib/spark
mv /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/lib/spark/assembly/lib/spark-assembly-1.0.0-cdh5.1.0-hadoop2.3.0-cdh5.1.0.jar /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/lib/spark/assembly/lib/spark-assembly-1.0.0-cdh5.1.0-hadoop2.3.0-cdh5.1.0.jar.Original
cp spark-assembly-1.1.0-SNAPSHOT-hadoop2.3.0-cdh5.1.0.jar /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/lib/spark/assembly/lib/
do this by ftp
spark-assembly-1.1.0-SNAPSHOT-hadoop2.3.0-cdh5.1.0.jar
did this - no reboot required
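The mv/cp pair above can be wrapped so that an existing backup is never overwritten by accident. A minimal sketch, with the function name swap_assembly and its two arguments chosen here for illustration:

```shell
# Sketch: keep the current assembly as <name>.Original, then copy the new
# jar into the same directory. swap_assembly is an illustrative name.
swap_assembly() {
    old_jar=$1      # full path of the assembly currently in place
    new_jar=$2      # replacement jar to copy in
    lib_dir=$(dirname "$old_jar")
    # Refuse to clobber a backup left by an earlier run.
    if [ -e "$old_jar.Original" ]; then
        echo "backup already exists: $old_jar.Original" >&2
        return 1
    fi
    mv "$old_jar" "$old_jar.Original" && cp "$new_jar" "$lib_dir/"
}
```

Rolling back is then a matter of deleting the new jar and renaming the .Original file back.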
scala> hiveContext.hql("SELECT COUNT(*) FROM bom_table_top").collect().foreach(println)
14/07/30 19:15:45 INFO ParseDriver: Parsing command: SELECT COUNT(*) FROM bom_table_top
14/07/30 19:15:45 INFO ParseDriver: Parse Completed
14/07/30 19:15:45 INFO Analyzer: Max iterations (2) reached for batch MultiInstanceRelations
14/07/30 19:15:45 INFO Analyzer: Max iterations (2) reached for batch CaseInsensitiveAttributeReferences
14/07/30 19:15:45 INFO metastore: Trying to connect to metastore with URI thrift://gmo-cl-edge-02:9083
14/07/30 19:15:45 INFO metastore: Waiting 1 seconds before next connection attempt.
14/07/30 19:15:46 INFO metastore: Connected to metastore.
14/07/30 19:15:47 INFO Analyzer: Max iterations (2) reached for batch Check Analysis
14/07/30 19:15:47 INFO deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/07/30 19:15:47 INFO MemoryStore: ensureFreeSpace(404207) called with curMem=0, maxMem=278302556
14/07/30 19:15:47 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 394.7 KB, free 265.0 MB)
14/07/30 19:15:47 INFO SQLContext$$anon$1: Max iterations (2) reached for batch Add exchange
14/07/30 19:15:47 INFO SQLContext$$anon$1: Max iterations (2) reached for batch Prepare Expressions
14/07/30 19:15:47 INFO SparkContext: Starting job: collect at SparkPlan.scala:52
14/07/30 19:15:48 INFO FileInputFormat: Total input paths to process : 1
14/07/30 19:15:48 INFO DAGScheduler: Registering RDD 5 (mapPartitions at Exchange.scala:69)
14/07/30 19:15:48 INFO DAGScheduler: Got job 0 (collect at SparkPlan.scala:52) with 1 output partitions (allowLocal=false)
14/07/30 19:15:48 INFO DAGScheduler: Final stage: Stage 0(collect at SparkPlan.scala:52)
14/07/30 19:15:48 INFO DAGScheduler: Parents of final stage: List(Stage 1)
14/07/30 19:15:48 INFO DAGScheduler: Missing parents: List(Stage 1)
14/07/30 19:15:48 INFO DAGScheduler: Submitting Stage 1 (MapPartitionsRDD[5] at mapPartitions at Exchange.scala:69), which has no missing parents
14/07/30 19:15:48 INFO DAGScheduler: Submitting 2 missing tasks from Stage 1 (MapPartitionsRDD[5] at mapPartitions at Exchange.scala:69)
14/07/30 19:15:48 INFO TaskSchedulerImpl: Adding task set 1.0 with 2 tasks
14/07/30 19:15:48 INFO TaskSetManager: Re-computing pending task lists.
14/07/30 19:15:48 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 0, localhost, PROCESS_LOCAL, 5668 bytes)
14/07/30 19:15:48 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 1, localhost, PROCESS_LOCAL, 5668 bytes)
14/07/30 19:15:48 INFO Executor: Running task 1.0 in stage 1.0 (TID 1)
14/07/30 19:15:48 INFO Executor: Running task 0.0 in stage 1.0 (TID 0)
14/07/30 19:15:48 INFO BlockManager: Found block broadcast_0 locally
14/07/30 19:15:48 INFO BlockManager: Found block broadcast_0 locally
14/07/30 19:15:48 INFO HadoopRDD: Input split: hdfs://gmo-cl-name-01:8020/user/hive/warehouse/bom_table_top/part-m-00000:39554203+39554204
14/07/30 19:15:48 INFO HadoopRDD: Input split: hdfs://gmo-cl-name-01:8020/user/hive/warehouse/bom_table_top/part-m-00000:0+39554203
14/07/30 19:15:48 INFO deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
14/07/30 19:15:48 INFO deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
14/07/30 19:15:48 INFO deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
14/07/30 19:15:48 INFO deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
14/07/30 19:15:48 INFO deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
14/07/30 19:15:49 INFO Executor: Finished task 0.0 in stage 1.0 (TID 0). 1868 bytes result sent to driver
14/07/30 19:15:49 INFO Executor: Finished task 1.0 in stage 1.0 (TID 1). 1868 bytes result sent to driver
14/07/30 19:15:49 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 0) in 1338 ms on localhost (1/2)
14/07/30 19:15:49 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 1) in 1330 ms on localhost (2/2)
14/07/30 19:15:49 INFO DAGScheduler: Stage 1 (mapPartitions at Exchange.scala:69) finished in 1.356 s
14/07/30 19:15:49 INFO DAGScheduler: looking for newly runnable stages
14/07/30 19:15:49 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
14/07/30 19:15:49 INFO DAGScheduler: running: Set()
14/07/30 19:15:49 INFO DAGScheduler: waiting: Set(Stage 0)
14/07/30 19:15:49 INFO DAGScheduler: failed: Set()
14/07/30 19:15:49 INFO DAGScheduler: Missing parents for Stage 0: List()
14/07/30 19:15:49 INFO DAGScheduler: Submitting Stage 0 (MappedRDD[9] at map at SparkPlan.scala:52), which is now runnable
14/07/30 19:15:49 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 (MappedRDD[9] at map at SparkPlan.scala:52)
14/07/30 19:15:49 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
14/07/30 19:15:49 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 2, localhost, PROCESS_LOCAL, 5843 bytes)
14/07/30 19:15:49 INFO Executor: Running task 0.0 in stage 0.0 (TID 2)
14/07/30 19:15:49 INFO BlockManager: Found block broadcast_0 locally
14/07/30 19:15:49 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, targetRequestSize: 10066329
14/07/30 19:15:49 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
14/07/30 19:15:49 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 6 ms
14/07/30 19:15:49 INFO Executor: Finished task 0.0 in stage 0.0 (TID 2). 1076 bytes result sent to driver
14/07/30 19:15:49 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 2) in 93 ms on localhost (1/1)
14/07/30 19:15:49 INFO DAGScheduler: Stage 0 (collect at SparkPlan.scala:52) finished in 0.094 s
14/07/30 19:15:49 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
14/07/30 19:15:49 INFO SparkContext: Job finished: collect at SparkPlan.scala:52, took 2.344278732 s
[249340]
scala>
sweet!
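Driver output like the log above is easier to read once the TaskSetManager "Finished task" lines are tallied per host. The awk sketch below is one way to do that; it assumes one log record per line (no hard wrapping) and the function name tasks_per_host is made up for illustration.

```shell
# Sketch: per-host task count and mean task time from a Spark driver log on stdin.
# Matches lines such as
#   ... INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 0) in 1338 ms on localhost (1/2)
# tasks_per_host is an illustrative name; output columns are: host, tasks, mean ms.
tasks_per_host() {
    awk '/TaskSetManager: Finished task/ {
             for (i = 2; i < NF - 1; i++) {
                 # The duration precedes "ms on", the host follows it.
                 if ($i == "ms" && $(i + 1) == "on") {
                     count[$(i + 2)]++
                     total[$(i + 2)] += $(i - 1)
                 }
             }
         }
         END {
             for (h in count)
                 printf "%s %d %.0f\n", h, count[h], total[h] / count[h]
         }' | sort
}
```

Fed the two TaskSetManager lines from the run above (1338 ms and 1330 ms, both on localhost), it prints "localhost 2 1334". On a cluster run the same tally shows whether some workers are getting fewer or slower tasks.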
spark://HOST:PORT
Connect to the given Spark standalone cluster master. The port must be whichever one your master is configured to use, which is 7077 by default.
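The master URL format described above is simple enough to build from a host name. A tiny sketch, with the helper name spark_master_url invented here; the only fact it relies on is the default port 7077 stated above.

```shell
# Sketch: build a standalone master URL, falling back to the default port.
# spark_master_url is an illustrative name.
spark_master_url() {
    host=$1
    port=${2:-7077}   # 7077 is the standalone master default
    printf 'spark://%s:%s\n' "$host" "$port"
}
```

For example, spark-shell --master "$(spark_master_url gmo-cl-edge-02)" expands to the spark://gmo-cl-edge-02:7077 form used in these notes.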
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
hiveContext.hql("SELECT LEVEL, COUNT(*) AS level_counts FROM dchtchou_bom_table GROUP BY LEVEL").collect().foreach(println)
spark-shell --master spark://gmo-cl-edge-02:7077
14/07/30 19:27:34 INFO SparkILoop: Created spark context..
Spark context available as sc.
scala> 14/07/30 19:27:34 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20140730192734-0000
Change both master and worker memory to 4g (in CM).

scala> hiveContext.hql("SELECT LEVEL, COUNT(*) AS level_counts FROM dchtchou_bom_table GROUP BY LEVEL").collect().foreach(println)
14/07/30 19:40:35 INFO ParseDriver: Parsing command: SELECT LEVEL, COUNT(*) AS level_counts FROM dchtchou_bom_table GROUP BY LEVEL
14/07/30 19:40:36 INFO ParseDriver: Parse Completed
14/07/30 19:40:36 INFO Analyzer: Max iterations (2) reached for batch MultiInstanceRelations
14/07/30 19:40:36 INFO Analyzer: Max iterations (2) reached for batch CaseInsensitiveAttributeReferences
14/07/30 19:40:36 INFO metastore: Trying to connect to metastore with URI thrift://gmo-cl-edge-02:9083
14/07/30 19:40:36 INFO metastore: Waiting 1 seconds before next connection attempt.
14/07/30 19:40:37 INFO metastore: Connected to metastore.
14/07/30 19:40:38 INFO Analyzer: Max iterations (2) reached for batch Check Analysis
14/07/30 19:40:38 INFO deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/07/30 19:40:38 INFO MemoryStore: ensureFreeSpace(404207) called with curMem=0, maxMem=278302556
14/07/30 19:40:38 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 394.7 KB, free 265.0 MB)
14/07/30 19:40:38 INFO MemoryStore: ensureFreeSpace(56) called with curMem=404207, maxMem=278302556
14/07/30 19:40:38 INFO MemoryStore: Block broadcast_0_meta stored as values in memory (estimated size 56.0 B, free 265.0 MB)
14/07/30 19:40:38 INFO BlockManagerInfo: Added broadcast_0_meta in memory on gmo-cl-edge-02:51538 (size: 56.0 B, free: 265.4 MB)
14/07/30 19:40:38 INFO BlockManagerMaster: Updated info of block broadcast_0_meta
14/07/30 19:40:38 INFO MemoryStore: ensureFreeSpace(90560) called with curMem=404263, maxMem=278302556
14/07/30 19:40:38 INFO MemoryStore: Block broadcast_0_piece0 stored as values in memory (estimated size 88.4 KB, free 264.9 MB)
14/07/30 19:40:38 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on gmo-cl-edge-02:51538 (size: 88.4 KB, free: 265.3 MB)
14/07/30 19:40:38 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
14/07/30 19:40:38 INFO SQLContext$$anon$1: Max iterations (2) reached for batch Add exchange
14/07/30 19:40:38 INFO SQLContext$$anon$1: Max iterations (2) reached for batch Prepare Expressions
14/07/30 19:40:39 INFO SparkContext: Starting job: collect at SparkPlan.scala:52
14/07/30 19:40:39 INFO FileInputFormat: Total input paths to process : 1
14/07/30 19:40:39 INFO FileInputFormat: Total input paths to process : 3
14/07/30 19:40:40 INFO FileInputFormat: Total input paths to process : 4
14/07/30 19:40:40 INFO FileInputFormat: Total input paths to process : 4
14/07/30 19:40:40 INFO FileInputFormat: Total input paths to process : 1
14/07/30 19:40:40 INFO NetworkTopology: Adding a new node: /default/192.168.1.17:50010
14/07/30 19:40:40 INFO NetworkTopology: Adding a new node: /default/192.168.1.16:50010
14/07/30 19:40:40 INFO NetworkTopology: Adding a new node: /default/192.168.1.14:50010
14/07/30 19:40:40 INFO FileInputFormat: Total input paths to process : 4
14/07/30 19:40:40 INFO FileInputFormat: Total input paths to process : 4
14/07/30 19:40:40 INFO FileInputFormat: Total input paths to process : 4
14/07/30 19:40:40 INFO FileInputFormat: Total input paths to process : 4
14/07/30 19:40:40 INFO FileInputFormat: Total input paths to process : 4
14/07/30 19:40:40 INFO FileInputFormat: Total input paths to process : 4
14/07/30 19:40:40 INFO NetworkTopology: Adding a new node: /default/192.168.1.17:50010
14/07/30 19:40:40 INFO NetworkTopology: Adding a new node: /default/192.168.1.15:50010
14/07/30 19:40:40 INFO NetworkTopology: Adding a new node: /default/192.168.1.14:50010
14/07/30 19:40:40 INFO FileInputFormat: Total input paths to process : 4
14/07/30 19:40:40 INFO FileInputFormat: Total input paths to process : 3
14/07/30 19:40:41 INFO FileInputFormat: Total input paths to process : 4
14/07/30 19:40:41 INFO NetworkTopology: Adding a new node: /default/192.168.1.16:50010
14/07/30 19:40:41 INFO NetworkTopology: Adding a new node: /default/192.168.1.15:50010
14/07/30 19:40:41 INFO NetworkTopology: Adding a new node: /default/192.168.1.14:50010
14/07/30 19:40:41 INFO NetworkTopology: Adding a new node: /default/192.168.1.17:50010
14/07/30 19:40:41 INFO FileInputFormat: Total input paths to process : 4
14/07/30 19:40:41 INFO FileInputFormat: Total input paths to process : 4
14/07/30 19:40:41 INFO NetworkTopology: Adding a new node: /default/192.168.1.16:50010
14/07/30 19:40:41 INFO NetworkTopology: Adding a new node: /default/192.168.1.15:50010
14/07/30 19:40:41 INFO NetworkTopology: Adding a new node: /default/192.168.1.14:50010
14/07/30 19:40:41 INFO NetworkTopology: Adding a new node: /default/192.168.1.17:50010
14/07/30 19:40:41 INFO FileInputFormat: Total input paths to process : 3
14/07/30 19:40:41 INFO FileInputFormat: Total input paths to process : 4
14/07/30 19:40:41 INFO FileInputFormat: Total input paths to process : 4
14/07/30 19:40:41 INFO FileInputFormat: Total input paths to process : 4
14/07/30 19:40:41 INFO DAGScheduler: Registering RDD 63 (mapPartitions at Exchange.scala:44)
14/07/30 19:40:41 INFO DAGScheduler: Got job 0 (collect at SparkPlan.scala:52) with 200 output partitions (allowLocal=false)
14/07/30 19:40:41 INFO DAGScheduler: Final stage: Stage 0(collect at SparkPlan.scala:52)
14/07/30 19:40:41 INFO DAGScheduler: Parents of final stage: List(Stage 1)
14/07/30 19:40:41 INFO DAGScheduler: Missing parents: List(Stage 1)
14/07/30 19:40:41 INFO DAGScheduler: Submitting Stage 1 (MapPartitionsRDD[63] at mapPartitions at Exchange.scala:44), which has no missing parents
14/07/30 19:40:41 INFO DAGScheduler: Submitting 1095 missing tasks from Stage 1 (MapPartitionsRDD[63] at mapPartitions at Exchange.scala:44)
14/07/30 19:40:41 INFO TaskSchedulerImpl: Adding task set 1.0 with 1095 tasks
14/07/30 19:40:41 INFO TaskSetManager: Starting task 2.0 in stage 1.0 (TID 0, gmo-cl-data-04, NODE_LOCAL, 18008 bytes)
14/07/30 19:40:41 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, gmo-cl-data-03, NODE_LOCAL, 18009 bytes)
14/07/30 19:40:41 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 2, gmo-cl-data-02, NODE_LOCAL, 18009 bytes)
14/07/30 19:40:41 INFO TaskSetManager: Starting task 3.0 in stage 1.0 (TID 3, gmo-cl-data-01, NODE_LOCAL, 18008 bytes)
14/07/30 19:40:41 INFO TaskSetManager: Starting task 5.0 in stage 1.0 (TID 4, gmo-cl-data-04, NODE_LOCAL, 18008 bytes)
14/07/30 19:40:41 INFO TaskSetManager: Starting task 4.0 in stage 1.0 (TID 5, gmo-cl-data-03, NODE_LOCAL, 18008 bytes)
14/07/30 19:40:41 INFO TaskSetManager: Starting task 6.0 in stage 1.0 (TID 6, gmo-cl-data-02, NODE_LOCAL, 18008 bytes)
14/07/30 19:40:41 INFO TaskSetManager: Starting task 7.0 in stage 1.0 (TID 7, gmo-cl-data-01, NODE_LOCAL, 18008 bytes)
14/07/30 19:40:41 INFO TaskSetManager: Starting task 8.0 in stage 1.0 (TID 8, gmo-cl-data-04, NODE_LOCAL, 18008 bytes)
14/07/30 19:40:41 INFO TaskSetManager: Starting task 10.0 in stage 1.0 (TID 9, gmo-cl-data-03, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:41 INFO TaskSetManager: Starting task 9.0 in stage 1.0 (TID 10, gmo-cl-data-02, NODE_LOCAL, 18008 bytes)
14/07/30 19:40:41 INFO TaskSetManager: Starting task 11.0 in stage 1.0 (TID 11, gmo-cl-data-01, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:41 INFO TaskSetManager: Starting task 12.0 in stage 1.0 (TID 12, gmo-cl-data-04, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:41 INFO TaskSetManager: Starting task 13.0 in stage 1.0 (TID 13, gmo-cl-data-03, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:41 INFO TaskSetManager: Starting task 16.0 in stage 1.0 (TID 14, gmo-cl-data-02, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:41 INFO TaskSetManager: Starting task 14.0 in stage 1.0 (TID 15, gmo-cl-data-01, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:42 INFO ConnectionManager: Accepted connection from [gmo-cl-data-03/192.168.1.16]
14/07/30 19:40:42 INFO ConnectionManager: Accepted connection from [gmo-cl-data-04/192.168.1.17]
14/07/30 19:40:42 INFO ConnectionManager: Accepted connection from [gmo-cl-data-02/192.168.1.15]
14/07/30 19:40:42 INFO SendingConnection: Initiating connection to [gmo-cl-data-04/192.168.1.17:37802]
14/07/30 19:40:42 INFO SendingConnection: Initiating connection to [gmo-cl-data-03/192.168.1.16:60851]
14/07/30 19:40:42 INFO SendingConnection: Initiating connection to [gmo-cl-data-02/192.168.1.15:48984]
14/07/30 19:40:42 INFO SendingConnection: Connected to [gmo-cl-data-04/192.168.1.17:37802], 1 messages pending
14/07/30 19:40:42 INFO SendingConnection: Connected to [gmo-cl-data-03/192.168.1.16:60851], 1 messages pending
14/07/30 19:40:42 INFO SendingConnection: Connected to [gmo-cl-data-02/192.168.1.15:48984], 1 messages pending
14/07/30 19:40:42 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on gmo-cl-data-04:37802 (size: 88.4 KB, free: 265.3 MB)
14/07/30 19:40:42 INFO ConnectionManager: Accepted connection from [gmo-cl-data-01/192.168.1.14]
14/07/30 19:40:42 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on gmo-cl-data-03:60851 (size: 88.4 KB, free: 265.3 MB)
14/07/30 19:40:42 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on gmo-cl-data-02:48984 (size: 88.4 KB, free: 265.3 MB)
14/07/30 19:40:42 INFO SendingConnection: Initiating connection to [gmo-cl-data-01/192.168.1.14:49107]
14/07/30 19:40:42 INFO SendingConnection: Connected to [gmo-cl-data-01/192.168.1.14:49107], 1 messages pending
14/07/30 19:40:43 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on gmo-cl-data-01:49107 (size: 88.4 KB, free: 265.3 MB)
14/07/30 19:40:46 INFO TaskSetManager: Starting task 15.0 in stage 1.0 (TID 16, gmo-cl-data-04, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:46 INFO TaskSetManager: Finished task 2.0 in stage 1.0 (TID 0) in 4833 ms on gmo-cl-data-04 (1/1095)
14/07/30 19:40:47 INFO TaskSetManager: Starting task 17.0 in stage 1.0 (TID 17, gmo-cl-data-03, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:47 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 5593 ms on gmo-cl-data-03 (2/1095)
14/07/30 19:40:47 INFO TaskSetManager: Starting task 19.0 in stage 1.0 (TID 18, gmo-cl-data-02, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:47 INFO TaskSetManager: Starting task 22.0 in stage 1.0 (TID 19, gmo-cl-data-02, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:47 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 2) in 5607 ms on gmo-cl-data-02 (3/1095)
14/07/30 19:40:47 INFO TaskSetManager: Finished task 6.0 in stage 1.0 (TID 6) in 5597 ms on gmo-cl-data-02 (4/1095)
14/07/30 19:40:47 INFO TaskSetManager: Starting task 18.0 in stage 1.0 (TID 20, gmo-cl-data-04, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:47 INFO TaskSetManager: Finished task 12.0 in stage 1.0 (TID 12) in 5842 ms on gmo-cl-data-04 (5/1095)
14/07/30 19:40:47 INFO TaskSetManager: Starting task 20.0 in stage 1.0 (TID 21, gmo-cl-data-04, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:47 INFO TaskSetManager: Finished task 8.0 in stage 1.0 (TID 8) in 5977 ms on gmo-cl-data-04 (6/1095)
14/07/30 19:40:48 INFO TaskSetManager: Starting task 23.0 in stage 1.0 (TID 22, gmo-cl-data-02, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:48 INFO TaskSetManager: Finished task 9.0 in stage 1.0 (TID 10) in 6129 ms on gmo-cl-data-02 (7/1095)
14/07/30 19:40:48 INFO TaskSetManager: Starting task 24.0 in stage 1.0 (TID 23, gmo-cl-data-02, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:48 INFO TaskSetManager: Finished task 16.0 in stage 1.0 (TID 14) in 6329 ms on gmo-cl-data-02 (8/1095)
14/07/30 19:40:48 INFO TaskSetManager: Starting task 21.0 in stage 1.0 (TID 24, gmo-cl-data-04, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:48 INFO TaskSetManager: Finished task 5.0 in stage 1.0 (TID 4) in 6372 ms on gmo-cl-data-04 (9/1095)
14/07/30 19:40:48 INFO TaskSetManager: Starting task 25.0 in stage 1.0 (TID 25, gmo-cl-data-03, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:48 INFO TaskSetManager: Finished task 10.0 in stage 1.0 (TID 9) in 6508 ms on gmo-cl-data-03 (10/1095)
14/07/30 19:40:48 INFO TaskSetManager: Starting task 26.0 in stage 1.0 (TID 26, gmo-cl-data-03, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:48 INFO TaskSetManager: Finished task 4.0 in stage 1.0 (TID 5) in 6524 ms on gmo-cl-data-03 (11/1095)
14/07/30 19:40:48 INFO TaskSetManager: Starting task 27.0 in stage 1.0 (TID 27, gmo-cl-data-04, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:48 INFO TaskSetManager: Finished task 15.0 in stage 1.0 (TID 16) in 1857 ms on gmo-cl-data-04 (12/1095)
14/07/30 19:40:48 INFO TaskSetManager: Starting task 28.0 in stage 1.0 (TID 28, gmo-cl-data-03, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:48 INFO TaskSetManager: Finished task 13.0 in stage 1.0 (TID 13) in 6759 ms on gmo-cl-data-03 (13/1095)
14/07/30 19:40:49 INFO TaskSetManager: Starting task 29.0 in stage 1.0 (TID 29, gmo-cl-data-02, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:49 INFO TaskSetManager: Finished task 22.0 in stage 1.0 (TID 19) in 1498 ms on gmo-cl-data-02 (14/1095)
14/07/30 19:40:49 INFO TaskSetManager: Starting task 30.0 in stage 1.0 (TID 30, gmo-cl-data-04, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:49 INFO TaskSetManager: Finished task 18.0 in stage 1.0 (TID 20) in 1275 ms on gmo-cl-data-04 (15/1095)
14/07/30 19:40:49 INFO TaskSetManager: Starting task 31.0 in stage 1.0 (TID 31, gmo-cl-data-02, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:49 INFO TaskSetManager: Finished task 23.0 in stage 1.0 (TID 22) in 1196 ms on gmo-cl-data-02 (16/1095)
14/07/30 19:40:49 INFO TaskSetManager: Starting task 32.0 in stage 1.0 (TID 32, gmo-cl-data-02, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:49 INFO TaskSetManager: Finished task 19.0 in stage 1.0 (TID 18) in 1826 ms on gmo-cl-data-02 (17/1095)
14/07/30 19:40:49 INFO TaskSetManager: Starting task 33.0 in stage 1.0 (TID 33, gmo-cl-data-03, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:49 INFO TaskSetManager: Finished task 17.0 in stage 1.0 (TID 17) in 2016 ms on gmo-cl-data-03 (18/1095)
14/07/30 19:40:49 INFO TaskSetManager: Starting task 35.0 in stage 1.0 (TID 34, gmo-cl-data-04, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:49 INFO TaskSetManager: Finished task 20.0 in stage 1.0 (TID 21) in 1750 ms on gmo-cl-data-04 (19/1095)
14/07/30 19:40:49 INFO TaskSetManager: Starting task 37.0 in stage 1.0 (TID 35, gmo-cl-data-04, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:49 INFO TaskSetManager: Finished task 27.0 in stage 1.0 (TID 27) in 1356 ms on gmo-cl-data-04 (20/1095)
14/07/30 19:40:50 INFO TaskSetManager: Starting task 38.0 in stage 1.0 (TID 36, gmo-cl-data-04, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:50 INFO TaskSetManager: Finished task 21.0 in stage 1.0 (TID 24) in 1718 ms on gmo-cl-data-04 (21/1095)
14/07/30 19:40:50 INFO TaskSetManager: Starting task 34.0 in stage 1.0 (TID 37, gmo-cl-data-02, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:50 INFO TaskSetManager: Finished task 24.0 in stage 1.0 (TID 23) in 1773 ms on gmo-cl-data-02 (22/1095)
14/07/30 19:40:50 INFO TaskSetManager: Starting task 36.0 in stage 1.0 (TID 38, gmo-cl-data-03, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:50 INFO TaskSetManager: Finished task 26.0 in stage 1.0 (TID 26) in 1715 ms on gmo-cl-data-03 (23/1095)
14/07/30 19:40:50 INFO TaskSetManager: Starting task 39.0 in stage 1.0 (TID 39, gmo-cl-data-03, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:50 INFO TaskSetManager: Finished task 25.0 in stage 1.0 (TID 25) in 1780 ms on gmo-cl-data-03 (24/1095)
14/07/30 19:40:50 INFO TaskSetManager: Starting task 40.0 in stage 1.0 (TID 40, gmo-cl-data-02, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:50 INFO TaskSetManager: Finished task 29.0 in stage 1.0 (TID 29) in 1701 ms on gmo-cl-data-02 (25/1095)
14/07/30 19:40:50 INFO TaskSetManager: Starting task 41.0 in stage 1.0 (TID 41, gmo-cl-data-04, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:50 INFO TaskSetManager: Finished task 30.0 in stage 1.0 (TID 30) in 1685 ms on gmo-cl-data-04 (26/1095)
14/07/30 19:40:50 INFO TaskSetManager: Starting task 42.0 in stage 1.0 (TID 42, gmo-cl-data-03, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:50 INFO TaskSetManager: Finished task 28.0 in stage 1.0 (TID 28) in 2060 ms on gmo-cl-data-03 (27/1095)
14/07/30 19:40:50 INFO TaskSetManager: Starting task 43.0 in stage 1.0 (TID 43, gmo-cl-data-02, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:50 INFO TaskSetManager: Finished task 31.0 in stage 1.0 (TID 31) in 1655 ms on gmo-cl-data-02 (28/1095)
14/07/30 19:40:50 INFO TaskSetManager: Starting task 44.0 in stage 1.0 (TID 44, gmo-cl-data-02, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:50 INFO TaskSetManager: Finished task 32.0 in stage 1.0 (TID 32) in 1650 ms on gmo-cl-data-02 (29/1095)
14/07/30 19:40:51 INFO TaskSetManager: Starting task 45.0 in stage 1.0 (TID 45, gmo-cl-data-03, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:51 INFO TaskSetManager: Finished task 33.0 in stage 1.0 (TID 33) in 1704 ms on gmo-cl-data-03 (30/1095)
14/07/30 19:40:51 INFO TaskSetManager: Starting task 46.0 in stage 1.0 (TID 46, gmo-cl-data-04, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:51 INFO TaskSetManager: Finished task 35.0 in stage 1.0 (TID 34) in 1615 ms on gmo-cl-data-04 (31/1095)
14/07/30 19:40:51 INFO TaskSetManager: Starting task 48.0 in stage 1.0 (TID 47, gmo-cl-data-04, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:51 INFO TaskSetManager: Finished task 37.0 in stage 1.0 (TID 35) in 1569 ms on gmo-cl-data-04 (32/1095)
14/07/30 19:40:51 INFO TaskSetManager: Starting task 49.0 in stage 1.0 (TID 48, gmo-cl-data-04, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:51 INFO TaskSetManager: Finished task 38.0 in stage 1.0 (TID 36) in 1503 ms on gmo-cl-data-04 (33/1095)
14/07/30 19:40:51 INFO TaskSetManager: Starting task 47.0 in stage 1.0 (TID 49, gmo-cl-data-02, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:51 INFO TaskSetManager: Finished task 34.0 in stage 1.0 (TID 37) in 1770 ms on gmo-cl-data-02 (34/1095)
14/07/30 19:40:51 INFO TaskSetManager: Starting task 50.0 in stage 1.0 (TID 50, gmo-cl-data-03, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:51 INFO TaskSetManager: Finished task 36.0 in stage 1.0 (TID 38) in 1703 ms on gmo-cl-data-03 (35/1095)
14/07/30 19:40:51 INFO TaskSetManager: Starting task 51.0 in stage 1.0 (TID 51, gmo-cl-data-03, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:51 INFO TaskSetManager: Finished task 39.0 in stage 1.0 (TID 39) in 1746 ms on gmo-cl-data-03 (36/1095)
14/07/30 19:40:52 INFO TaskSetManager: Starting task 52.0 in stage 1.0 (TID 52, gmo-cl-data-03, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:52 INFO TaskSetManager: Finished task 42.0 in stage 1.0 (TID 42) in 1733 ms on gmo-cl-data-03 (37/1095)
14/07/30 19:40:52 INFO TaskSetManager: Starting task 53.0 in stage 1.0 (TID 53, gmo-cl-data-02, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:52 INFO TaskSetManager: Finished task 40.0 in stage 1.0 (TID 40) in 1823 ms on gmo-cl-data-02 (38/1095)
14/07/30 19:40:52 INFO TaskSetManager: Starting task 55.0 in stage 1.0 (TID 54, gmo-cl-data-02, NODE_LOCAL, 18007 bytes)
14/07/30 19:40:52 INFO TaskSetManager: Finished task 43.0 in stage 1.0 (TID 43) in 1721 ms on gmo-cl-data-02 (39/1095)

14/07/30 19:43:54 INFO TaskSetManager: Finished task 1041.0 in stage 1.0 (TID 1093) in 2456 ms on gmo-cl-data-02 (1091/1095)
14/07/30 19:43:54 INFO TaskSetManager: Finished task 1017.0 in stage 1.0 (TID 1092) in 2510 ms on gmo-cl-data-03 (1092/1095)
14/07/30 19:43:55 INFO TaskSetManager: Finished task 735.0 in stage 1.0 (TID 1089) in 3425 ms on gmo-cl-data-02 (1093/1095)
14/07/30 19:43:56 INFO TaskSetManager: Finished task 875.0 in stage 1.0 (TID 1091) in 4296 ms on gmo-cl-data-03 (1094/1095)
14/07/30 19:43:56 INFO TaskSetManager: Finished task 1045.0 in stage 1.0 (TID 1094) in 4288 ms on gmo-cl-data-03 (1095/1095)
14/07/30 19:43:56 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
14/07/30 19:43:56 INFO DAGScheduler: Stage 1 (mapPartitions at Exchange.scala:44) finished in 194.792 s
14/07/30 19:43:56 INFO DAGScheduler: looking for newly runnable stages
14/07/30 19:43:56 INFO DAGScheduler: running: Set()
14/07/30 19:43:56 INFO DAGScheduler: waiting: Set(Stage 0)
14/07/30 19:43:56 INFO DAGScheduler: failed: Set()
14/07/30 19:43:56 INFO DAGScheduler: Missing parents for Stage 0: List()
14/07/30 19:43:56 INFO DAGScheduler: Submitting Stage 0 (MappedRDD[67] at map at SparkPlan.scala:52), which is now runnable
14/07/30 19:43:56 INFO DAGScheduler: Submitting 200 missing tasks from Stage 0 (MappedRDD[67] at map at SparkPlan.scala:52)
14/07/30 19:43:56 INFO TaskSchedulerImpl: Adding task set 0.0 with 200 tasks
14/07/30 19:43:56 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 1095, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1096, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 1097, gmo-cl-data-01, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 1098, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 1099, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 1100, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 1101, gmo-cl-data-01, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 1102, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 1103, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 1104, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 10.0 in stage 0.0 (TID 1105, gmo-cl-data-01, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 11.0 in stage 0.0 (TID 1106, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 12.0 in stage 0.0 (TID 1107, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 13.0 in stage 0.0 (TID 1108, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 14.0 in stage 0.0 (TID 1109, gmo-cl-data-01, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 15.0 in stage 0.0 (TID 1110
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO MapOutputTrackerMasterActor: Asked to send map output loc
ations for shuffle 0 to spark@gmo-cl-data-02:39656
14/07/30 19:43:56 INFO MapOutputTrackerMaster: Size of output statuses for shuff
le 0 is 2766 bytes
14/07/30 19:43:56 INFO MapOutputTrackerMasterActor: Asked to send map output loc
ations for shuffle 0 to spark@gmo-cl-data-03:37009
14/07/30 19:43:56 INFO MapOutputTrackerMasterActor: Asked to send map output loc
ations for shuffle 0 to spark@gmo-cl-data-04:42405
14/07/30 19:43:56 INFO TaskSetManager: Starting task 16.0 in stage 0.0 (TID 1111
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 17.0 in stage 0.0 (TID 1112
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 18.0 in stage 0.0 (TID 1113
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 1095)
in 224 ms on gmo-cl-data-02 (1/200)
14/07/30 19:43:56 INFO TaskSetManager: Finished task 8.0 in stage 0.0 (TID 1103)
in 228 ms on gmo-cl-data-02 (2/200)
14/07/30 19:43:56 INFO TaskSetManager: Finished task 11.0 in stage 0.0 (TID 1106
) in 228 ms on gmo-cl-data-03 (3/200)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 19.0 in stage 0.0 (TID 1114
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Finished task 12.0 in stage 0.0 (TID 1107
) in 231 ms on gmo-cl-data-02 (4/200)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 20.0 in stage 0.0 (TID 1115
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Finished task 15.0 in stage 0.0 (TID 1110
) in 232 ms on gmo-cl-data-03 (5/200)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 21.0 in stage 0.0 (TID 1116
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Finished task 7.0 in stage 0.0 (TID 1102)
in 236 ms on gmo-cl-data-03 (6/200)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 22.0 in stage 0.0 (TID 1117
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1096)
in 239 ms on gmo-cl-data-04 (7/200)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 23.0 in stage 0.0 (TID 1118
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Finished task 5.0 in stage 0.0 (TID 1100)
in 238 ms on gmo-cl-data-04 (8/200)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 24.0 in stage 0.0 (TID 1119
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Finished task 13.0 in stage 0.0 (TID 1108
) in 236 ms on gmo-cl-data-04 (9/200)
14/07/30 19:43:56 INFO TaskSetManager: Starting task 25.0 in stage 0.0 (TID 1120
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:56 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 1098)
in 240 ms on gmo-cl-data-03 (10/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 26.0 in stage 0.0 (TID 1121
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 16.0 in stage 0.0 (TID 1111
) in 56 ms on gmo-cl-data-02 (11/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 27.0 in stage 0.0 (TID 1122
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 17.0 in stage 0.0 (TID 1112
) in 59 ms on gmo-cl-data-02 (12/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 28.0 in stage 0.0 (TID 1123
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 21.0 in stage 0.0 (TID 1116
) in 47 ms on gmo-cl-data-03 (13/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 29.0 in stage 0.0 (TID 1124
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 22.0 in stage 0.0 (TID 1117
) in 62 ms on gmo-cl-data-04 (14/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 30.0 in stage 0.0 (TID 1125
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 24.0 in stage 0.0 (TID 1119
) in 61 ms on gmo-cl-data-04 (15/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 31.0 in stage 0.0 (TID 1126
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 20.0 in stage 0.0 (TID 1115
) in 67 ms on gmo-cl-data-03 (16/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 32.0 in stage 0.0 (TID 1127
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 18.0 in stage 0.0 (TID 1113
) in 81 ms on gmo-cl-data-03 (17/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 33.0 in stage 0.0 (TID 1128
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 26.0 in stage 0.0 (TID 1121
) in 43 ms on gmo-cl-data-02 (18/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 34.0 in stage 0.0 (TID 1129
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 27.0 in stage 0.0 (TID 1122
) in 45 ms on gmo-cl-data-02 (19/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 35.0 in stage 0.0 (TID 1130
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 29.0 in stage 0.0 (TID 1124
) in 43 ms on gmo-cl-data-04 (20/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 36.0 in stage 0.0 (TID 1131
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 30.0 in stage 0.0 (TID 1125
) in 62 ms on gmo-cl-data-04 (21/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 37.0 in stage 0.0 (TID 1132
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 28.0 in stage 0.0 (TID 1123
) in 79 ms on gmo-cl-data-03 (22/200)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 33.0 in stage 0.0 (TID 1128
) in 44 ms on gmo-cl-data-02 (23/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 38.0 in stage 0.0 (TID 1133
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 39.0 in stage 0.0 (TID 1134
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 34.0 in stage 0.0 (TID 1129
) in 44 ms on gmo-cl-data-02 (24/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 40.0 in stage 0.0 (TID 1135
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 35.0 in stage 0.0 (TID 1130
) in 47 ms on gmo-cl-data-04 (25/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 41.0 in stage 0.0 (TID 1136
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 32.0 in stage 0.0 (TID 1127
) in 88 ms on gmo-cl-data-03 (26/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 42.0 in stage 0.0 (TID 1137
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 31.0 in stage 0.0 (TID 1126
) in 98 ms on gmo-cl-data-03 (27/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 43.0 in stage 0.0 (TID 1138
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 38.0 in stage 0.0 (TID 1133
) in 41 ms on gmo-cl-data-02 (28/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 44.0 in stage 0.0 (TID 1139
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 36.0 in stage 0.0 (TID 1131
) in 46 ms on gmo-cl-data-04 (29/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 45.0 in stage 0.0 (TID 1140
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 37.0 in stage 0.0 (TID 1132
) in 50 ms on gmo-cl-data-03 (30/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 46.0 in stage 0.0 (TID 1141
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 42.0 in stage 0.0 (TID 1137
) in 22 ms on gmo-cl-data-03 (31/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 47.0 in stage 0.0 (TID 1142
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 41.0 in stage 0.0 (TID 1136
) in 33 ms on gmo-cl-data-03 (32/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 48.0 in stage 0.0 (TID 1143
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 49.0 in stage 0.0 (TID 1144
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 45.0 in stage 0.0 (TID 1140
) in 21 ms on gmo-cl-data-03 (33/200)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 39.0 in stage 0.0 (TID 1134
) in 64 ms on gmo-cl-data-02 (34/200)
14/07/30 19:43:57 INFO MapOutputTrackerMasterActor: Asked to send map output loc
ations for shuffle 0 to spark@gmo-cl-data-01:49590
14/07/30 19:43:57 INFO TaskSetManager: Starting task 50.0 in stage 0.0 (TID 1145
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 48.0 in stage 0.0 (TID 1143
) in 21 ms on gmo-cl-data-03 (35/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 51.0 in stage 0.0 (TID 1146
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 47.0 in stage 0.0 (TID 1142
) in 43 ms on gmo-cl-data-03 (36/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 52.0 in stage 0.0 (TID 1147
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 50.0 in stage 0.0 (TID 1145
) in 40 ms on gmo-cl-data-03 (37/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 53.0 in stage 0.0 (TID 1148
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 44.0 in stage 0.0 (TID 1139
) in 124 ms on gmo-cl-data-04 (38/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 54.0 in stage 0.0 (TID 1149
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 40.0 in stage 0.0 (TID 1135
) in 145 ms on gmo-cl-data-04 (39/200)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 43.0 in stage 0.0 (TID 1138
) in 134 ms on gmo-cl-data-02 (40/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 55.0 in stage 0.0 (TID 1150
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 56.0 in stage 0.0 (TID 1151
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 49.0 in stage 0.0 (TID 1144
) in 112 ms on gmo-cl-data-02 (41/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 57.0 in stage 0.0 (TID 1152
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 55.0 in stage 0.0 (TID 1150
) in 36 ms on gmo-cl-data-02 (42/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 58.0 in stage 0.0 (TID 1153
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 53.0 in stage 0.0 (TID 1148
) in 46 ms on gmo-cl-data-04 (43/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 59.0 in stage 0.0 (TID 1154
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 54.0 in stage 0.0 (TID 1149
) in 44 ms on gmo-cl-data-04 (44/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 60.0 in stage 0.0 (TID 1155
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 51.0 in stage 0.0 (TID 1146
) in 128 ms on gmo-cl-data-03 (45/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 61.0 in stage 0.0 (TID 1156
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 52.0 in stage 0.0 (TID 1147
) in 104 ms on gmo-cl-data-03 (46/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 62.0 in stage 0.0 (TID 1157
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 57.0 in stage 0.0 (TID 1152
) in 22 ms on gmo-cl-data-02 (47/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 63.0 in stage 0.0 (TID 1158
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 56.0 in stage 0.0 (TID 1151
) in 58 ms on gmo-cl-data-02 (48/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 64.0 in stage 0.0 (TID 1159
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 59.0 in stage 0.0 (TID 1154
) in 41 ms on gmo-cl-data-04 (49/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 65.0 in stage 0.0 (TID 1160
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 60.0 in stage 0.0 (TID 1155
) in 27 ms on gmo-cl-data-03 (50/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 66.0 in stage 0.0 (TID 1161
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 58.0 in stage 0.0 (TID 1153
) in 55 ms on gmo-cl-data-04 (51/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 67.0 in stage 0.0 (TID 1162
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 62.0 in stage 0.0 (TID 1157
) in 37 ms on gmo-cl-data-02 (52/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 68.0 in stage 0.0 (TID 1163
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 63.0 in stage 0.0 (TID 1158
) in 36 ms on gmo-cl-data-02 (53/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 69.0 in stage 0.0 (TID 1164
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 64.0 in stage 0.0 (TID 1159
) in 21 ms on gmo-cl-data-04 (54/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 70.0 in stage 0.0 (TID 1165
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 61.0 in stage 0.0 (TID 1156
) in 46 ms on gmo-cl-data-03 (55/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 71.0 in stage 0.0 (TID 1166
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 65.0 in stage 0.0 (TID 1160
) in 21 ms on gmo-cl-data-03 (56/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 72.0 in stage 0.0 (TID 1167
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 67.0 in stage 0.0 (TID 1162
) in 31 ms on gmo-cl-data-02 (57/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 73.0 in stage 0.0 (TID 1168
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 66.0 in stage 0.0 (TID 1161
) in 35 ms on gmo-cl-data-04 (58/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 74.0 in stage 0.0 (TID 1169
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 71.0 in stage 0.0 (TID 1166
) in 26 ms on gmo-cl-data-03 (59/200)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 70.0 in stage 0.0 (TID 1165
) in 39 ms on gmo-cl-data-03 (60/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 75.0 in stage 0.0 (TID 1170
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 76.0 in stage 0.0 (TID 1171
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 46.0 in stage 0.0 (TID 1141
) in 267 ms on gmo-cl-data-03 (61/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 77.0 in stage 0.0 (TID 1172
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 1099)
in 690 ms on gmo-cl-data-02 (62/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 78.0 in stage 0.0 (TID 1173
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 68.0 in stage 0.0 (TID 1163
) in 54 ms on gmo-cl-data-02 (63/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 79.0 in stage 0.0 (TID 1174
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 80.0 in stage 0.0 (TID 1175
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 77.0 in stage 0.0 (TID 1172
) in 22 ms on gmo-cl-data-02 (64/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 81.0 in stage 0.0 (TID 1176
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 73.0 in stage 0.0 (TID 1168
) in 48 ms on gmo-cl-data-04 (65/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 82.0 in stage 0.0 (TID 1177
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 69.0 in stage 0.0 (TID 1164
) in 76 ms on gmo-cl-data-04 (66/200)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 74.0 in stage 0.0 (TID 1169
) in 48 ms on gmo-cl-data-03 (67/200)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 76.0 in stage 0.0 (TID 1171
) in 28 ms on gmo-cl-data-03 (68/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 83.0 in stage 0.0 (TID 1178
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 84.0 in stage 0.0 (TID 1179
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 78.0 in stage 0.0 (TID 1173
) in 25 ms on gmo-cl-data-02 (69/200)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 82.0 in stage 0.0 (TID 1177
) in 23 ms on gmo-cl-data-03 (70/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 85.0 in stage 0.0 (TID 1180
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 86.0 in stage 0.0 (TID 1181
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 75.0 in stage 0.0 (TID 1170
) in 60 ms on gmo-cl-data-03 (71/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 87.0 in stage 0.0 (TID 1182
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 72.0 in stage 0.0 (TID 1167
) in 77 ms on gmo-cl-data-02 (72/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 88.0 in stage 0.0 (TID 1183
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 80.0 in stage 0.0 (TID 1175
) in 35 ms on gmo-cl-data-02 (73/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 89.0 in stage 0.0 (TID 1184
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 84.0 in stage 0.0 (TID 1179
) in 34 ms on gmo-cl-data-02 (74/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 90.0 in stage 0.0 (TID 1185
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 86.0 in stage 0.0 (TID 1181
) in 16 ms on gmo-cl-data-03 (75/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 91.0 in stage 0.0 (TID 1186
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 9.0 in stage 0.0 (TID 1104)
in 779 ms on gmo-cl-data-04 (76/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 92.0 in stage 0.0 (TID 1187
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 89.0 in stage 0.0 (TID 1184
) in 35 ms on gmo-cl-data-02 (77/200)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 79.0 in stage 0.0 (TID 1174
) in 74 ms on gmo-cl-data-04 (78/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 93.0 in stage 0.0 (TID 1188
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 94.0 in stage 0.0 (TID 1189
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 88.0 in stage 0.0 (TID 1183
) in 41 ms on gmo-cl-data-02 (79/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 95.0 in stage 0.0 (TID 1190
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 23.0 in stage 0.0 (TID 1118
) in 550 ms on gmo-cl-data-04 (80/200)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 87.0 in stage 0.0 (TID 1182
) in 50 ms on gmo-cl-data-02 (81/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 96.0 in stage 0.0 (TID 1191
, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 90.0 in stage 0.0 (TID 1185
) in 38 ms on gmo-cl-data-03 (82/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 97.0 in stage 0.0 (TID 1192
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 98.0 in stage 0.0 (TID 1193
, gmo-cl-data-03, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:57 INFO TaskSetManager: Finished task 25.0 in stage 0.0 (TID 1120
) in 564 ms on gmo-cl-data-03 (83/200)
14/07/30 19:43:57 INFO TaskSetManager: Starting task 99.0 in stage 0.0 (TID 1194
, gmo-cl-data-04, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:58 INFO TaskSetManager: Finished task 184.0 in stage 0.0 (TID 127
9) in 56 ms on gmo-cl-data-04 (183/200)
14/07/30 19:43:58 INFO TaskSetManager: Starting task 199.0 in stage 0.0 (TID 129
4, gmo-cl-data-02, PROCESS_LOCAL, 8734 bytes)
14/07/30 19:43:58 INFO TaskSetManager: Finished task 188.0 in stage 0.0 (TID 128
3) in 39 ms on gmo-cl-data-02 (184/200)
14/07/30 19:43:58 INFO TaskSetManager: Finished task 189.0 in stage 0.0 (TID 128
4) in 37 ms on gmo-cl-data-04 (185/200)
14/07/30 19:43:58 INFO TaskSetManager: Finished task 91.0 in stage 0.0 (TID 1186
) in 536 ms on gmo-cl-data-04 (186/200)
14/07/30 19:43:58 INFO TaskSetManager: Finished task 197.0 in stage 0.0 (TID 129
2) in 27 ms on gmo-cl-data-03 (187/200)
14/07/30 19:43:58 INFO TaskSetManager: Finished task 190.0 in stage 0.0 (TID 128
5) in 48 ms on gmo-cl-data-01 (188/200)
14/07/30 19:43:58 INFO TaskSetManager: Finished task 193.0 in stage 0.0 (TID 128
8) in 42 ms on gmo-cl-data-02 (189/200)
14/07/30 19:43:58 INFO TaskSetManager: Finished task 194.0 in stage 0.0 (TID 128
9) in 43 ms on gmo-cl-data-02 (190/200)
14/07/30 19:43:58 INFO TaskSetManager: Finished task 196.0 in stage 0.0 (TID 129
1) in 46 ms on gmo-cl-data-03 (191/200)

14/07/30 19:43:58 INFO TaskSetManager: Finished task 199.0 in stage 0.0 (TID 129
4) in 43 ms on gmo-cl-data-02 (192/200)
14/07/30 19:43:58 INFO TaskSetManager: Finished task 198.0 in stage 0.0 (TID 129
3) in 49 ms on gmo-cl-data-04 (193/200)
14/07/30 19:43:58 INFO TaskSetManager: Finished task 192.0 in stage 0.0 (TID 128
7) in 74 ms on gmo-cl-data-01 (194/200)
14/07/30 19:43:58 INFO TaskSetManager: Finished task 195.0 in stage 0.0 (TID 129
0) in 73 ms on gmo-cl-data-03 (195/200)
14/07/30 19:43:58 INFO TaskSetManager: Finished task 191.0 in stage 0.0 (TID 128
6) in 372 ms on gmo-cl-data-01 (196/200)
14/07/30 19:43:58 INFO TaskSetManager: Finished task 138.0 in stage 0.0 (TID 123
3) in 641 ms on gmo-cl-data-01 (197/200)
14/07/30 19:43:58 INFO TaskSetManager: Finished task 104.0 in stage 0.0 (TID 119
9) in 939 ms on gmo-cl-data-02 (198/200)
14/07/30 19:43:58 INFO TaskSetManager: Finished task 137.0 in stage 0.0 (TID 123
2) in 793 ms on gmo-cl-data-04 (199/200)
14/07/30 19:43:59 INFO TaskSetManager: Finished task 165.0 in stage 0.0 (TID 126
0) in 1605 ms on gmo-cl-data-03 (200/200)
14/07/30 19:43:59 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have
all completed, from pool
14/07/30 19:43:59 INFO DAGScheduler: Stage 0 (collect at SparkPlan.scala:52) fin
ished in 2.771 s
14/07/30 19:43:59 INFO SparkContext: Job finished: collect at SparkPlan.scala:52
, took 200.469593935 s
[16,476377]
[3,2220985]
[10,11041237]
[14,1852499]
[4,4947681]
[13,1353556]
[12,1686848]
[17,15441]
[15,1771336]
[18,3537]
[6,16810306]
[2,764589]
[7,49529326]
[19,5143]
[11,4848319]
[5,7586226]
[8,29843447]
[20,1743]
[9,123893626]
[1,249340]
scala>
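The TaskSetManager lines above are easier to read once summarized per host. The following is a minimal Python sketch, not part of the original session: it rejoins the 80-column wrapped lines and totals the "Finished task" durations for each host. The names `unwrap` and `per_host` are hypothetical, and the regular expression is only an assumption about the log layout as it appears in these notes.

```python
import re
from collections import defaultdict

# A new log entry starts with a timestamp such as "14/07/30 19:43:56 ".
ENTRY_START = re.compile(r"^\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} ")

# "\)\s*in" tolerates wraps where the space after ")" was lost at column 80.
FINISHED = re.compile(
    r"Finished task (?P<task>\d+\.\d+) in stage (?P<stage>\d+\.\d+) "
    r"\(TID (?P<tid>\d+)\)\s*in (?P<ms>\d+) ms on (?P<host>\S+) "
    r"\((?P<done>\d+)/(?P<total>\d+)\)"
)


def unwrap(lines):
    """Glue continuation lines back onto the entry they were wrapped from."""
    entries = []
    for line in lines:
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += line
    return entries


def per_host(lines):
    """Return {host: (finished_task_count, total_ms)} for 'Finished task' entries."""
    stats = defaultdict(lambda: [0, 0])
    for entry in unwrap(lines):
        match = FINISHED.search(entry)
        if match:
            host_stats = stats[match["host"]]
            host_stats[0] += 1
            host_stats[1] += int(match["ms"])
    return {host: tuple(values) for host, values in stats.items()}


if __name__ == "__main__":
    sample = [
        "14/07/30 19:43:56 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 1095)",
        "in 224 ms on gmo-cl-data-02 (1/200)",
        "14/07/30 19:43:56 INFO TaskSetManager: Finished task 11.0 in stage 0.0 (TID 1106",
        ") in 228 ms on gmo-cl-data-03 (3/200)",
    ]
    for host, (count, total_ms) in sorted(per_host(sample).items()):
        print(host, count, total_ms)
```

Feeding it a whole driver log would show which workers received the slow tasks, for example the few tasks in stage 0 that took several hundred milliseconds.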
SparkSQL can't do functions like concat yet... have to program it in Spark :)
14/08/01 03:49:46 INFO ParseDriver: Parsing command: SELECT
ASSEMBLY_NAME
,ASSEMBLY_DESC
,ASSEMBLY_PF
,ASSEMBLY_BU
,ASSEMBLY_TG
,ASSEMBLY_ITEM_ID
,BILL_SEQUENCE_ID
,COMPONENT_NAME
,COMPONENT_DESC
,COMPONENT_PF
,COMPONENT_BU
,COMPONENT_TG
,COMPONENT_ITEM_ID
,COMPONENT_QUANTITY
,EFFECTIVITY_DATE
,DISABLE_DATE
,ALTERNATE_BOM_DESIGNATOR
,ORGANIZATION_ID
,ORGANIZATION_CODE
,TEST_COMMENT
,concat('<i>',table_top.COMPONENT_NAME,'</i>') AS PATH
,1 AS LEVEL
,table_top.COMPONENT_NAME AS LEVEL_1
,LEVEL_2
,LEVEL_3
,LEVEL_4
,LEVEL_5
,LEVEL_6
,LEVEL_7
,LEVEL_8
,LEVEL_9
,LEVEL_10
,LEVEL_11
,LEVEL_12
,LEVEL_13
,LEVEL_14
,LEVEL_15
,LEVEL_16
,LEVEL_17
,LEVEL_18
,LEVEL_19
,LEVEL_20
,LEVEL_21
,LEVEL_22
,LEVEL_23
,LEVEL_24
,LEVEL_25
,LEVEL_26
,LEVEL_27
,LEVEL_28
,LEVEL_29
,LEVEL_30
,1 as level_partition
FROM bom_table_top
14/08/01 03:49:46 INFO ParseDriver: Parse Completed
14/08/01 03:49:47 INFO SparkDeploySchedulerBackend: Registered executor: Actor[a
kka.tcp://sparkExecutor@gmo-cl-data-03:41181/user/Executor#1938503466] with ID 0
14/08/01 03:49:47 INFO SparkDeploySchedulerBackend: Registered executor: Actor[a
kka.tcp://sparkExecutor@gmo-cl-data-04:37881/user/Executor#2064189298] with ID 3
14/08/01 03:49:47 INFO BlockManagerMasterActor: Registering block manager gmo-cl
-data-03:59963 with 2.1 GB RAM
14/08/01 03:49:47 INFO SparkDeploySchedulerBackend: Registered executor: Actor[a
kka.tcp://sparkExecutor@gmo-cl-data-02:49467/user/Executor#114880777] with ID 2
14/08/01 03:49:47 INFO BlockManagerMasterActor: Registering block manager gmo-cl
-data-04:57370 with 2.1 GB RAM
14/08/01 03:49:47 INFO Analyzer: Max iterations (2) reached for batch MultiInsta
nceRelations
14/08/01 03:49:47 INFO Analyzer: Max iterations (2) reached for batch CaseInsens
itiveAttributeReferences
14/08/01 03:49:47 INFO BlockManagerMasterActor: Registering block manager gmo-cl
-data-02:51039 with 2.1 GB RAM
14/08/01 03:49:47 INFO metastore: Trying to connect to metastore with URI thrift
://gmo-cl-edge-02:9083
14/08/01 03:49:47 INFO metastore: Waiting 1 seconds before next connection attem
pt.
14/08/01 03:49:48 INFO metastore: Connected to metastore.
Exception in thread "main" org.apache.spark.sql.catalyst.errors.package$TreeNode
Exception: Unresolved attributes: 'concat(<i>,'table_top.component_name,</i>) AS
path#4,'table_top.component_name AS level_1#6, tree:
Project [assembly_name#8,assembly_desc#9,assembly_pf#10,assembly_bu#11,assembly_
tg#12,assembly_item_id#13,bill_sequence_id#14,component_name#15,component_desc#1
6,component_pf#17,component_bu#18,component_tg#19,component_item_id#20,component
_quantity#21,effectivity_date#22,disable_date#23,alternate_bom_designator#24,org
anization_id#25,organization_code#26,test_comment#27,'concat(<i>,'table_top.comp
onent_name,</i>) AS path#4,1 AS level#5,'table_top.component_name AS level_1#6,l
evel_2#31,level_3#32,level_4#33,level_5#34,level_6#35,level_7#36,level_8#37,leve
l_9#38,level_10#39,level_11#40,level_12#41,level_13#42,level_14#43,level_15#44,l
evel_16#45,level_17#46,level_18#47,level_19#48,level_20#49,level_21#50,level_22#
51,level_23#52,level_24#53,level_25#54,level_26#55,level_27#56,level_28#57,level
_29#58,level_30#59,1 AS level_partition#3]
LowerCaseSchema
MetastoreRelation default, bom_table_top, None
at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anon
fun$apply$1.applyOrElse(Analyzer.scala:71)
at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anon
fun$apply$1.applyOrElse(Analyzer.scala:69)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.s
cala:165)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala
:156)
at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.appl
y(Analyzer.scala:69)
at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.appl
y(Analyzer.scala:67)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$an
onfun$apply$2.apply(RuleExecutor.scala:62)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$an
onfun$apply$2.apply(RuleExecutor.scala:60)
at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.
scala:51)
at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimiz
ed.scala:60)
at scala.collection.mutable.WrappedArray.foldLeft(WrappedArray.scala:34)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.app
ly(RuleExecutor.scala:60)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.app
ly(RuleExecutor.scala:52)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.s
cala:52)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQ
LContext.scala:317)
at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.sc
ala:317)
at org.apache.spark.sql.hive.HiveContext$QueryExecution.optimizedPlan$lz
ycompute(HiveContext.scala:250)
at org.apache.spark.sql.hive.HiveContext$QueryExecution.optimizedPlan(Hi
veContext.scala:249)
at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(S
QLContext.scala:320)
at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.s
cala:320)
at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycomput
e(SQLContext.scala:323)
at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContex
t.scala:323)
at org.apache.spark.sql.SchemaRDD.collect(SchemaRDD.scala:428)
at sparksql_shell$.main(sparksql_shell.scala:49)
at sparksql_shell.main(sparksql_shell.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.
java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
sorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:313)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:73)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Created the table dchthou_bom_table_n77_c7706_fab_02 through CREATE TABLE ... AS SELECT.
Set the partition on it now.
This works in Spark!
SELECT ASSEMBLY_NAME
,concat('<i>',table_top.COMPONENT_NAME,'</i>')
,instr(concat('<i>',table_top.COMPONENT_NAME,'</i>')
,table_top.PATH) AS PATH
FROM bom_table_top as table_top
WHERE COMPONENT_NAME = "N77-C7706-FAB-2"
INSERT with PARTITION does not work
sparksql-shell: reading file:BOM_TABLE_LEVEL_JobCheck_LoadScript.hql
sparksql-shell: running query:
SELECT
LEVEL
,COUNT(*) level_counts
FROM dchtchou_bom_table_sparksql
GROUP BY LEVEL
ORDER BY LEVEL
[1,249340]
[2,764588]
[3,2220984]
[4,4947681]
[5,7586226]
[6,16810306]
[7,49529326]
[8,29843447]
[9,123893626]

sparksql-shell: finished in 1.98 min


-- ===============================================================================================
-- = End level JobCheck duration: 0:02:04.017641, total duration: 0:57:27.331271
-- ===============================================================================================
-- ===============================================================================================
-- = Total levels JobCheck duration: 0:57:27.331271
-- ===============================================================================================
[dchtchou@gmo-cl-edge-02 TestCasePartitionOnLevelTextFile]$ bash: o0: command not found
[dchtchou@gmo-cl-edge-02 TestCasePartitionOnLevelTextFile]$ [dchtchou@gmo-cl-edge-02 TestCasePartitionOnLevelTextFile]$
1: End level 1 duration: 0:00:19.851785
2: End level 2 duration: 0:01:05.316941
3: End level 3 duration: 0:01:18.682567
4: End level 4 duration: 0:01:43.262963
5: End level 5 duration: 0:02:17.959594
6: End level 6 duration: 0:04:10.404508
7: End level 7 duration: 0:09:28.680896
8: End level 8 duration: 0:07:24.876291
9: End level 9 duration: 0:26:14.202519
10: End level 10 duration: 0:07:06.248668
11: End level 11 duration: 0:04:45.621255
12: End level 12 duration: 0:04:05.014905
13: End level 13 duration: 0:04:04.052752
14: End level 14 duration: 0:04:14.497213
15: End level 15 duration: 0:03:58.408727
16: End level 16 duration: 0:00:06.566186
17: End level 17 duration: 0:00:07.192965
18: End level 18 duration: 0:01:03.132999
19: End level 19 duration: 0:01:06.603037
20: End level 20 duration: 0:01:00.923189
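To see where the time goes, the per-level durations above can be summed and ranked with a few lines of Python. This is only a sketch: the `LOG` sample holds the first three lines, and the names `parse_durations` and `slowest` are made up for illustration, not part of the load scripts.

```python
import re
from datetime import timedelta

# Sample lines copied from the notes above; paste the full list to total all levels.
LOG = """\
1: End level 1 duration: 0:00:19.851785
2: End level 2 duration: 0:01:05.316941
3: End level 3 duration: 0:01:18.682567
"""

# Matches "End level <n> duration: H:MM:SS.ffffff" (skips the "JobCheck" banner lines).
LINE_RE = re.compile(r"End level (\d+) duration: (\d+):(\d+):(\d+(?:\.\d+)?)")


def parse_durations(text):
    """Return {level: timedelta} for every 'End level <n>' line in text."""
    durations = {}
    for match in LINE_RE.finditer(text):
        level, hours, minutes, seconds = match.groups()
        durations[int(level)] = timedelta(
            hours=int(hours), minutes=int(minutes), seconds=float(seconds)
        )
    return durations


def slowest(durations):
    """Return the level number with the longest duration."""
    return max(durations, key=durations.get)


durations = parse_durations(LOG)
total = sum(durations.values(), timedelta())
print("total:", total)
print("slowest level:", slowest(durations))
```

With the full list pasted into `LOG`, the total can be compared against the totals the scripts print in their own banners.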
SELECT
LEVEL
,COUNT(*) level_counts
FROM dchtchou_bom_table_sparksql
GROUP BY LEVEL
ORDER BY LEVEL
[1,249340]
[2,764588]
[3,2220984]
[4,4947681]
[5,7586226]
[6,16810306]
[7,49529326]
[8,29843447]
[9,123893626]
[10,11041237]
[11,4848319]
[12,1686848]
[13,1353556]
[14,1852499]
[15,1771336]
[16,476377]
[17,15441]
[18,3537]
[19,5143]
[20,1743]
sparksql-shell: finished in 2.13 min
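The row counts per level seem to track the per-level durations (level 9 has the most rows and also took the longest). A quick way to check each level's share is to parse the `[level,count]` rows. A sketch only, using three of the rows above as sample input; `parse_counts` is a hypothetical helper name.

```python
import re

# Three of the [level,count] rows from the query output above.
ROWS = """\
[7,49529326]
[8,29843447]
[9,123893626]
"""

# One "[level,count]" pair per line, as printed by sparksql-shell.
ROW_RE = re.compile(r"^\[(\d+),(\d+)\]$", re.MULTILINE)


def parse_counts(text):
    """Return {level: row_count} for every [level,count] line in text."""
    return {int(level): int(count) for level, count in ROW_RE.findall(text)}


counts = parse_counts(ROWS)
total = sum(counts.values())
for level, count in sorted(counts.items()):
    # Share is relative to the rows in ROWS, not the whole table.
    print(f"level {level}: {count} rows ({count / total:.1%} of sample)")
```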
INSERT INTO dchtchou_bom_pairs_parquet
SELECT * FROM bom_pairs_tsv
See if we can catch bom_pairs_tsv
It's doing this weird thing:
14/08/07 03:58:57 INFO NetworkTopology: Adding a new node: /default/192.168.1.14:50010
14/08/07 03:58:58 INFO DAGScheduler: Registering RDD 5 (mapPartitions at Exchange.scala:69)
14/08/07 03:58:58 INFO DAGScheduler: Got job 0 (collect at SparkPlan.scala:52) with 1 output partitions (allowLocal=false)
14/08/07 03:58:58 INFO DAGScheduler: Final stage: Stage 0(collect at SparkPlan.scala:52)
14/08/07 03:58:58 INFO DAGScheduler: Parents of final stage: List(Stage 1)
14/08/07 03:58:58 INFO DAGScheduler: Missing parents: List(Stage 1)
14/08/07 03:58:58 INFO DAGScheduler: Submitting Stage 1 (MapPartitionsRDD[5] at mapPartitions at Exchange.scala:69), which has no missing parents
14/08/07 03:58:59 INFO DAGScheduler: Submitting 4394 missing tasks from Stage 1 (MapPartitionsRDD[5] at mapPartitions at Exchange.scala:69)
14/08/07 03:58:59 INFO TaskSchedulerImpl: Adding task set 1.0 with 4394 tasks
14/08/07 03:59:14 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
[dchtchou@gmo-cl-edge-02 sparksql-shell]$ /home/dchtchou/src/ion-pcm/src/spark/sparksql-shell/sparksql-shell.sh -q "SELECT COUNT(*) FROM dchtchou_bom_table_sparksql" 2>log.txt
sparksql-shell: using context:org.apache.spark.SparkContext@1629aeb2
sparksql-shell: running query:
SELECT COUNT(*) FROM dchtchou_bom_table_sparksql
C-c C-c[dchtchou@gmo-cl-edge-02 sparksql-shell]$ /home/dchtchou/src/ion-pcm/src/spark/sparksql-shell/sparksql-shell.sh -q "SELECT COUNT(*) FROM dchtchou_bom_table_sparksql" 2>log.txt
sparksql-shell: using context:org.apache.spark.SparkContext@7a86b09d
sparksql-shell: running query:
SELECT COUNT(*) FROM dchtchou_bom_table_sparksql
C-c C-c[dchtchou@gmo-cl-edge-02 sparksql-shell]$
