Stories

Detail Return Return

【趙渝強老師】Scala編程語言 - Stories Detail

Scala是一種多範式的編程語言,其設計的初衷是要集成面向對象編程和函數式編程的各種特性。Scala運行於Java平台(Java虛擬機)之上,併兼容現有的Java程序。因此,要安裝Scala環境之前,首先需要安裝Java的JDK。學習Scala編程語言,將為後續學習Spark和Flink奠定基礎。視頻講解如下:
https://www.bilibili.com/video/BV1wdUWYeEcS/?aid=113503537076...

下面的代碼展示了在Spark中如何使用Scala開發一個WordCount程序。

package demo

import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.log4j.Logger
import org.apache.log4j.Level

object WordCountDemo {
  def main(args: Array[String]): Unit = {
    Logger.getLogger("org.apache.spark").setLevel(Level.ERROR)
    Logger.getLogger("org.eclipse.jetty.server").setLevel(Level.OFF)
    
    //本地模式
    //val conf = new SparkConf().setAppName("WordCountDemo").setMaster("local")
    
    //集羣模式
    val conf = new SparkConf().setAppName("WordCountDemo")
    
    //創建SparkContext
    val sc = new SparkContext(conf)
    
    val result = sc.textFile("hdfs://bigdata111:9000/input/data.txt")
                    .flatMap(_.split(" "))
                    .map((_,1))
                    .reduceByKey(_+_)
                    
    //輸出到屏幕
    result.collect.foreach(println)
    
    //輸出到HDFS
    result.saveAsTextFile("hdfs://bigdata111:9000/output/spark/wc")
    
    sc.stop
  }
}

在Flink中也可以使用Scala編程語言,下面的代碼也將在Flink中執行一個WordCount程序。

package demo

import org.apache.flink.api.scala._

object WordCount {
    def main(args: Array[String]) {

    val env = ExecutionEnvironment.getExecutionEnvironment
    val text = env.fromElements("I love Beijing","I love China",
                                "Beijing is the capital of China")
                                
    val counts = text.flatMap(_.split(" ")).map((_,1)).groupBy(0).sum(1)
    counts.print()
  }
}
user avatar ljc1212 Avatar
Favorites 1 users favorite the story!
Favorites

Add a new Comments

Some HTML is okay.