## Lecture Location and Time

Room 307, Tongbo Building（通博楼）. Periods **10** to **11**, every Tuesday evening, from Week **1** to Week **17**.

## Syllabus

- W1: Introduction to Big Data Processing
- W2: Data Replication
- W3: Data Partitioning
- W4: Hadoop + HDFS
- W5: Hadoop + HDFS
- W6: Resource Manager
- W7: Column-oriented Database
- W8: Hive + HBase
- W9: Students’ Presentations
- W10: In-memory KV Database
- W11: Spark + Spark SQL
- W12: Spark + Spark SQL
- W13: Spark + Spark SQL
- W14: Algorithms for Big Data
- W15: Algorithms for Big Data
- W16: Algorithms for Big Data
- W17: Students’ Presentations for Final Projects

## Final Project

Choose one of the following:

### 1. Paper Reviews

Review at least 5 papers published in top-tier conferences or journals within the last 5 years.

### 2. Compute Cumulative Sum

The `cumulative sum` (or prefix-sum) operator takes an array \(a_1, a_2, \dots, a_n\) and returns an array \(s_1, s_2, \dots, s_n\) where \(s_i = \sum_{j \leq i}a_j\). For example, starting with the array **17 0 5 32**, it returns **17 17 22 54**.
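As a sequential reference point, the definition above can be sketched in a few lines of Python. The function name `cumulative_sum` is chosen here for illustration only; it is not part of the assignment, and a single-machine version like this is useful mainly for checking the output of a distributed implementation.

```python
from itertools import accumulate


def cumulative_sum(values):
    """Return the prefix sums s_i = a_1 + ... + a_i of a sequence."""
    # itertools.accumulate yields running totals, which is exactly s_1, ..., s_n.
    return list(accumulate(values))


print(cumulative_sum([17, 0, 5, 32]))  # [17, 17, 22, 54]
```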

Describe how to implement `cumulative sum` in MapReduce, and implement your idea with either `Spark` or `Hadoop`.
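One possible starting point, offered only as a sketch and not as the expected solution, is a two-pass scheme over contiguous partitions: first compute each partition's total, then shift each partition's local prefix sums by the totals of all earlier partitions. The plain-Python simulation below assumes the input is split into ordered chunks. The names `split_into_partitions` and `partitioned_cumulative_sum` are hypothetical, and this is not Spark or Hadoop code.

```python
from itertools import accumulate


def split_into_partitions(values, num_partitions):
    """Split values into contiguous, order-preserving chunks (hypothetical helper)."""
    size = max(1, -(-len(values) // num_partitions))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def partitioned_cumulative_sum(values, num_partitions=2):
    """Simulate a two-pass, partition-based prefix sum on one machine."""
    partitions = split_into_partitions(values, num_partitions)

    # Pass 1: each partition independently computes its own total.
    totals = [sum(part) for part in partitions]

    # Combine step: the offset of partition k is the sum of all earlier totals.
    offsets = [0] + list(accumulate(totals))[:-1]

    # Pass 2: each partition computes local prefix sums, shifted by its offset.
    result = []
    for part, offset in zip(partitions, offsets):
        result.extend(offset + s for s in accumulate(part))
    return result


print(partitioned_cumulative_sum([17, 0, 5, 32], num_partitions=2))
```

In this sketch only the list of per-partition totals has to be gathered in one place, so the combine step stays small even when the partitions themselves are large. A real implementation would also need to guarantee that partitions keep the original element order.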