1. Mutable and immutable collections
Example:
Define an immutable collection: the values of its elements cannot be changed.
scala> val math = scala.collection.immutable.Map("Tom"->80,"Mary"->90,"Mike"->95)
math: scala.collection.immutable.Map[String,Int] = Map(Tom -> 80, Mary -> 90, Mike -> 95)
Define a mutable collection:
scala> val math = scala.collection.mutable.Map("Tom"->80,"Mary"->90,"Mike"->95)
math: scala.collection.mutable.Map[String,Int] = Map(Mike -> 95, Tom -> 80, Mary -> 90)
Operations:
scala> math.get("Tom")
res43: Option[Int] = Some(80)
scala> math.get("Tomlkdfjlskjdfl")
res44: Option[Int] = None
scala> math("Tom")
res45: Int = 80
scala> math("Todsfjlskjdflm")
java.util.NoSuchElementException: key not found: Todsfjlskjdflm
  at scala.collection.MapLike$class.default(MapLike.scala:228)
  at scala.collection.AbstractMap.default(Map.scala:59)
  at scala.collection.mutable.HashMap.apply(HashMap.scala:65)
  ... 32 elided
scala> math.contains("Todsfjlskjdflm")
res47: Boolean = false
scala> math.contains("Tom")
res48: Boolean = true
scala> math.getOrElse("Tom",-1)
res49: Int = 80
scala> math.getOrElse("Tsdlfjlskdjflom",-1)
res50: Int = -1
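The lookups above differ mainly in how they treat a missing key: `get` returns an Option, `apply` throws, and `getOrElse` substitutes a default. As a minimal sketch, assuming a hypothetical helper named `describe` (not part of the Scala library) and the same scores as above, both outcomes of `get` can be handled with a match:

```scala
object LookupSketch {
  // Same data as the math map above (an immutable Map is enough for lookups).
  val math: Map[String, Int] = Map("Tom" -> 80, "Mary" -> 90, "Mike" -> 95)

  // describe is a hypothetical helper for illustration only.
  // get returns Option[Int], so a missing key yields None instead of an exception.
  def describe(name: String): String = math.get(name) match {
    case Some(score) => s"$name scored $score"
    case None        => s"$name not found"
  }

  def main(args: Array[String]): Unit = {
    println(describe("Tom")) // Tom scored 80
    println(describe("Bob")) // Bob not found
  }
}
```

Matching on the Option keeps the missing-key case explicit, whereas `math("Bob")` would throw a NoSuchElementException.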
Update a value in the collection:
Note: the collection must be mutable.
scala> math
res51: scala.collection.mutable.Map[String,Int] = Map(Mike -> 95, Tom -> 80, Mary -> 90)
scala> math("Tom")=59
scala> math
res53: scala.collection.mutable.Map[String,Int] = Map(Mike -> 95, Tom -> 59, Mary -> 90)
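If the update fails (for example, because math was created as an immutable Map), import the mutable collections and define the map again: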
scala> import scala.collection.mutable._
import scala.collection.mutable._
Add and remove elements:
scala> math += "Bob" -> 95
res54: math.type = Map(Bob -> 95, Mike -> 95, Tom -> 59, Mary -> 90)
scala> math -= "Bob"
res55: math.type = Map(Mike -> 95, Tom -> 59, Mary -> 90)
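The difference between the two kinds of Map shows up most clearly when adding an entry. The following is a small sketch, with the object and value names (`UpdateSketch`, `fixed`, `extended`, `scores`) chosen only for illustration: on an immutable Map, `+` returns a new map and leaves the original untouched, while on a mutable Map the update, `+=`, and `-=` operations change the map in place.

```scala
import scala.collection.mutable

object UpdateSketch {
  // Immutable Map: + builds a new map, fixed itself never changes.
  val fixed: Map[String, Int] = Map("Tom" -> 80)
  val extended: Map[String, Int] = fixed + ("Bob" -> 95)

  // Mutable Map: the same object is modified by each statement.
  val scores: mutable.Map[String, Int] = mutable.Map("Tom" -> 80)
  scores("Tom") = 59      // update an existing key
  scores += ("Bob" -> 95) // add a new entry
  scores -= "Tom"         // remove an entry

  def main(args: Array[String]): Unit = {
    println(fixed.contains("Bob")) // false
    println(extended("Bob"))       // 95
    println(scores.contains("Tom")) // false
    println(scores("Bob"))          // 95
  }
}
```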
2. Lists: the mutable LinkedList and the immutable List
scala> val nameList = List("Tom","Andy")
nameList: List[String] = List(Tom, Andy)
scala> val intList = List(1,2,3)
intList: List[Int] = List(1, 2, 3)
scala> val nullList = List()
nullList: List[Nothing] = List()
scala> val dim : List[List[Int]] = List(List(1,2,3),List(4,5,6))
dim: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6))
scala> val dim : List[List[Int]] = List(List(1,2,3),List(4,5))
dim: List[List[Int]] = List(List(1, 2, 3), List(4, 5))
scala> nameList.head
res1: String = Tom
scala> nameList.tail
res2: List[String] = List(Andy)
scala> val nameList = List("Tom","Andy","Lily")
nameList: List[String] = List(Tom, Andy, Lily)
scala> nameList.tail
res3: List[String] = List(Andy, Lily)
tail does not return the last element; it returns the list that remains after the first element is removed.
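Because `head` is the first element and `tail` is everything after it, the two together describe a list recursively. A minimal sketch, assuming a hypothetical helper named `total`, that sums a list this way:

```scala
object HeadTailSketch {
  // total is a hypothetical helper: the sum of a list is its head
  // plus the sum of its tail; the empty list sums to 0.
  def total(xs: List[Int]): Int =
    if (xs.isEmpty) 0 else xs.head + total(xs.tail)

  def main(args: Array[String]): Unit = {
    val intList = List(1, 2, 3)
    println(intList.head)   // 1
    println(intList.tail)   // List(2, 3)
    println(intList.last)   // 3
    println(total(intList)) // 6
  }
}
```

Note that `last`, not `tail`, returns the final element, and that calling `head` on an empty list throws a NoSuchElementException; `headOption` returns an Option instead.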
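LinkedList (deprecated since Scala 2.11 and removed in Scala 2.13, which is why the REPL prints a deprecation warning):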
scala> var myList = scala.collection.mutable.LinkedList(1,2,3,4,5)
warning: there was one deprecation warning; re-run with -deprecation for details
myList: scala.collection.mutable.LinkedList[Int] = LinkedList(1, 2, 3, 4, 5)
scala> myList.map(_*2)
warning: there was one deprecation warning; re-run with -deprecation for details
res4: scala.collection.mutable.LinkedList[Int] = LinkedList(2, 4, 6, 8, 10)
Traverse the list using a pointer:
scala> myList
res5: scala.collection.mutable.LinkedList[Int] = LinkedList(1, 2, 3, 4, 5)
Define a pointer that points to the start of the list:
scala> var cur = myList
cur: scala.collection.mutable.LinkedList[Int] = LinkedList(1, 2, 3, 4, 5)
scala> while(cur != Nil){
| cur.elem = cur.elem * 2
| cur = cur.next
| }
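Nil is the empty List; comparing cur with Nil checks whether the end of the list has been reached, much like a null check.
None is the empty Option, the Option counterpart of null.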
scala> myList
res7: scala.collection.mutable.LinkedList[Int] = LinkedList(2, 4, 6, 8, 10)
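Since `scala.collection.mutable.LinkedList` is not available in Scala 2.13 and later, the same in-place doubling can be sketched with a different mutable collection. The version below swaps in `ArrayBuffer` and an index loop in place of the pointer traversal; `BufferSketch` and `doubleInPlace` are hypothetical names used only for this illustration.

```scala
import scala.collection.mutable.ArrayBuffer

object BufferSketch {
  // doubleInPlace is a hypothetical helper that mirrors the while loop above,
  // using an index instead of a next pointer.
  def doubleInPlace(buf: ArrayBuffer[Int]): Unit = {
    var i = 0
    while (i < buf.length) {
      buf(i) = buf(i) * 2 // overwrite the element at index i
      i += 1
    }
  }

  def main(args: Array[String]): Unit = {
    val myList = ArrayBuffer(1, 2, 3, 4, 5)
    doubleInPlace(myList)
    println(myList) // ArrayBuffer(2, 4, 6, 8, 10)
  }
}
```

When in-place mutation is not required, `myList.map(_ * 2)` produces the doubled collection without a loop.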
3. Sequences
Sequences in Scala: Vector and Range
A Vector is an indexed sequence; its elements can be accessed by index.
scala> var v = Vector(1,2,3,4,5,6)
v: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3, 4, 5, 6)
scala> v.
++             diff             headOption          minBy              scanLeft       toIterable
++:            distinct         indexOf             mkString           scanRight      toIterator
+:             drop             indexOfSlice        nonEmpty           segmentLength  toList
/:             dropRight        indexWhere          orElse             seq            toMap
:+             dropWhile        indices             padTo              size           toSeq
:\             endsWith         init                par                slice          toSet
WithFilter     equals           inits               partition          sliding        toStream
addString      exists           intersect           patch              sortBy         toString
aggregate      filter           isDefinedAt         permutations       sortWith       toTraversable
andThen        filterNot        isEmpty             prefixLength       sorted         toVector
apply          find             isTraversableAgain  product            span           transpose
applyOrElse    flatMap          iterator            reduce             splitAt        union
canEqual       flatten          last                reduceLeft         startsWith     unzip
collect        fold             lastIndexOf         reduceLeftOption   stringPrefix   unzip3
collectFirst   foldLeft         lastIndexOfSlice    reduceOption       sum            updated
combinations   foldRight        lastIndexWhere      reduceRight        tail           view
companion      forall           lastOption          reduceRightOption  tails          withFilter
compose        foreach          length              repr               take           zip
contains       genericBuilder   lengthCompare       reverse            takeRight      zipAll
containsSlice  groupBy          lift                reverseIterator    takeWhile      zipWithIndex
copyToArray    grouped          map                 reverseMap         to
copyToBuffer   hasDefiniteSize  max                 runWith            toArray
corresponds    hashCode         maxBy               sameElements       toBuffer
count          head             min                 scan               toIndexedSeq
scala> v(0)
res0: Int = 1
scala> Range(0,5)
res8: scala.collection.immutable.Range = Range(0, 1, 2, 3, 4)
scala> println(0 until 5)
Range(0, 1, 2, 3, 4)
scala> println(0 to 5)
Range(0, 1, 2, 3, 4, 5)
scala> ('0' to '9') ++ ('A' to 'Z')
res11: scala.collection.immutable.IndexedSeq[Char] = Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z)
scala> 1 to 5 toList
warning: there was one feature warning; re-run with -feature for details
res12: List[Int] = List(1, 2, 3, 4, 5)
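A few of the methods listed above are worth seeing in action. The sketch below, with all value names chosen only for illustration, shows that a Vector is immutable (`:+` and `updated` return new Vectors), that a Range can take a step with `by`, and that writing `(1 to 5).toList` as a normal method call avoids the feature warning triggered by the postfix form `1 to 5 toList`.

```scala
object SeqSketch {
  val v: Vector[Int] = Vector(1, 2, 3)

  // Both operations return a new Vector; v itself is unchanged.
  val appended: Vector[Int] = v :+ 4
  val replaced: Vector[Int] = v.updated(0, 9)

  // A Range with an explicit step.
  val evens: List[Int] = (0 to 10 by 2).toList

  // Method-call syntax instead of the postfix operator.
  val asList: List[Int] = (1 to 5).toList

  def main(args: Array[String]): Unit = {
    println(appended) // Vector(1, 2, 3, 4)
    println(replaced) // Vector(9, 2, 3)
    println(v)        // Vector(1, 2, 3)
    println(evens)    // List(0, 2, 4, 6, 8, 10)
    println(asList)   // List(1, 2, 3, 4, 5)
  }
}
```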
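4. Sets and set operations
- A Set is a collection of distinct elements.
- Unlike a List, a Set does not preserve the insertion order of its elements. By default it is implemented as a hash set.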
scala> var s1 = Set(1,2,3)
s1: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
scala> s1 + 10
res13: scala.collection.immutable.Set[Int] = Set(1, 2, 3, 10)
scala> s1 + 7
res14: scala.collection.immutable.Set[Int] = Set(1, 2, 3, 7)
scala> s1
res15: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
A sorted Set:
scala> var s2 = scala.collection.mutable.SortedSet(1,2,3,10,8,5)
s2: scala.collection.mutable.SortedSet[Int] = TreeSet(1, 2, 3, 5, 8, 10)
scala> s2.contains(1)
res16: Boolean = true
scala> s1
res17: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
Check whether one set is a subset of another:
scala> s1 subsetOf(s2)
res18: Boolean = true
Set operations: union, intersect (intersection), and diff (difference)
scala> var set1 = Set(1,2,3,4,5,6)
set1: scala.collection.immutable.Set[Int] = Set(5, 1, 6, 2, 3, 4)
scala> var set2 = Set(5,6,7,8,9,10)
set2: scala.collection.immutable.Set[Int] = Set(5, 10, 6, 9, 7, 8)
scala> set1 union set2
res19: scala.collection.immutable.Set[Int] = Set(5, 10, 1, 6, 9, 2, 7, 3, 8, 4)
scala> set1 intersect set2
res20: scala.collection.immutable.Set[Int] = Set(5, 6)
scala> set1 diff set2
res21: scala.collection.immutable.Set[Int] = Set(1, 2, 3, 4)
scala> set2 diff set1
res22: scala.collection.immutable.Set[Int] = Set(10, 9, 7, 8)
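The named methods above also have symbolic aliases: `|` for union, `&` for intersect, and `&~` for diff. A short sketch using the same two sets follows; the object and value names are illustrative only, and the results are sorted before printing because a hash-based Set does not guarantee any iteration order.

```scala
object SetOpsSketch {
  val set1: Set[Int] = Set(1, 2, 3, 4, 5, 6)
  val set2: Set[Int] = Set(5, 6, 7, 8, 9, 10)

  val either: Set[Int] = set1 | set2     // same as set1 union set2
  val both: Set[Int] = set1 & set2       // same as set1 intersect set2
  val onlyFirst: Set[Int] = set1 &~ set2 // same as set1 diff set2

  def main(args: Array[String]): Unit = {
    println(either.toList.sorted)    // List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    println(both.toList.sorted)      // List(5, 6)
    println(onlyFirst.toList.sorted) // List(1, 2, 3, 4)
    println(both.subsetOf(set1))     // true
  }
}
```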
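The same union operation exists in SQL:
select **** from ****
union
select ****
(*) Every set taking part in the operation must have the same number of columns, with matching column types.
select A,B,C,D from ***
union
select A,B,C from ****
The example above is invalid because the two queries select different numbers of columns.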
select A from table1 group by A;
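5. Pattern matching
Scala has a powerful pattern matching mechanism that can be used in many situations:
switch statements
type checks
Scala also provides case classes, which are optimized for pattern matching.
Pattern matching examples:
A better switch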
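/**
 * Pattern matching
 */
object Demo1 {
  def main(args: Array[String]): Unit = {
    // 1. Equivalent to switch/case
    val chi = '+'
    var sign = 0 // sign flag: set to -1 when chi is '-'
    chi match {
      case '+' => sign = 1
      case '-' => sign = -1
      case _ => sign = 0 // _ matches every other case
    }
    println(sign)

    // 2. Guards: match every value that satisfies a condition (case _ if ...)
    // Here: match ch2 when it is any digit
    val ch2 = '*'
    var digit: Int = -1
    ch2 match {
      case '+' => println("This is a plus sign")
      case '-' => println("This is a minus sign")
      case _ if Character.isDigit(ch2) => digit = Character.digit(ch2, 10) // 10 means base 10
      case _ => println("Other")
    }
    println(digit)

    // 3. Binding a variable in a pattern
    val mystr = "Hello World"
    mystr(7) match {
      case '+' => println("This is a plus sign")
      case '-' => println("This is a minus sign")
      case ch123 => println(ch123) // ch123 is bound to the matched character
    }

    // 4. Matching on type (similar to instanceof)
    /**
     * Types in Scala that are easy to confuse
     * Any: the root of the type hierarchy, comparable to Object in Java
     * Unit: no meaningful value, comparable to void
     * Nothing: the bottom of the Scala type hierarchy; a subtype of every other type
     * Null: a subtype of every reference type; its only value is null
     *
     * Option: represents a value that is optional (present or absent)
     * Some: if the value exists, the Option is a Some
     * None: if the value does not exist, the Option is None
     *
     * Summary of the four N's: None, Nothing, Null, Nil
     * None: the counterpart of Some
     * Nothing: the type of an expression that throws an exception
     * Null: a subtype of the reference types; value null
     * Nil: an empty List
     *
     */
    val v4: Any = 1000
    v4 match {
      case x: Int => println("This is an integer")
      case s: String => println("This is a string")
      case _ => println("Some other type")
    }

    // 5. Matching arrays and lists
    val myArray = Array(1, 2, 3)
    myArray match {
      case Array(0) => println("The array contains only a 0")
      case Array(x, y) => println("The array contains two elements")
      case Array(x, y, z) => println("The array contains three elements")
      case Array(x, _*) => println("This is an array with multiple elements")
    }
    val myList = List(1, 2, 3)
    myList match {
      case List(0) => println("The list contains only a 0")
      case List(x, y) => println("The list contains two elements, sum: " + (x + y))
      case List(x, y, z) => println("The list contains three elements, sum: " + (x + y + z))
      case List(x, _*) => println("This is a list with multiple elements, sum: " + myList.sum)
    }
  }
}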
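Demo1 assigns to a var inside each case, but a match is itself an expression that yields a value. The sketch below restates three of the patterns in that style; `signOf`, `digitOf`, and `sumOf` are hypothetical helper names, and returning an Option from `digitOf` is a design choice made here in place of the -1 sentinel.

```scala
object MatchExprSketch {
  // The match yields the result directly, so no var is needed.
  def signOf(ch: Char): Int = ch match {
    case '+' => 1
    case '-' => -1
    case _   => 0
  }

  // A guard selects every digit; None replaces the -1 sentinel.
  def digitOf(ch: Char): Option[Int] = ch match {
    case c if Character.isDigit(c) => Some(Character.digit(c, 10))
    case _                         => None
  }

  // Nil matches the empty list, head :: tail splits a non-empty one.
  def sumOf(xs: List[Int]): Int = xs match {
    case Nil          => 0
    case head :: tail => head + sumOf(tail)
  }

  def main(args: Array[String]): Unit = {
    println(signOf('-'))          // -1
    println(digitOf('7'))         // Some(7)
    println(digitOf('*'))         // None
    println(sumOf(List(1, 2, 3))) // 6
  }
}
```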
6. Case classes
scala> class Student(sid:Int)
defined class Student
scala> case class Student(sid:Int)
defined class Student
Uses:
(1) Support for pattern matching, comparable to instanceof in Java
(2) Defining the schema for Spark SQL
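Beyond pattern matching, the compiler generates several members for a case class that a plain class lacks. A small sketch of the difference, where `PlainStudent` is a hypothetical name introduced only to contrast with the case class:

```scala
object CaseClassSketch {
  class PlainStudent(val sid: Int) // ordinary class, for comparison
  case class Student(sid: Int)     // case class

  def main(args: Array[String]): Unit = {
    val s1 = Student(1) // no new needed: a companion apply is generated
    val s2 = Student(1)
    println(s1 == s2)         // true: equality compares the fields
    println(s1)               // Student(1)
    println(s1.copy(sid = 2)) // Student(2)
    // A plain class falls back to reference equality.
    println(new PlainStudent(1) == new PlainStudent(1)) // false
  }
}
```

The generated `unapply` method is what allows a pattern such as `case Student(sid)` to extract the field.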
scala> class Fruit
defined class Fruit
scala> class Apple(name:String) extends Fruit
defined class Apple
scala> class Banana(name:String) extends Fruit
defined class Banana
scala> var a = new Apple("Apple")
a: Apple = Apple@8f911ab
scala> var b = new Banana("Banana")
b: Banana = Banana@90d007a
scala> println(a.isInstanceOf[Fruit])
true
scala> println(a.isInstanceOf[Banana])
<console>:16: warning: fruitless type test: a value of type Apple cannot also be a Banana
       println(a.isInstanceOf[Banana])
                             ^
false
Use case classes together with pattern matching to meet the same requirement:
/**
 * Case classes and pattern matching
 */
class Vehicle
case class Car(name: String) extends Vehicle
case class Bike(name: String) extends Vehicle

object Demo2 {
  def main(args: Array[String]): Unit = {
    val a: Vehicle = new Car("Car")
    a match {
      case Car(name) => println("Car " + name)
      case Bike(name) => println("Bike " + name)
      case _ => println("Other")
    }
  }
}
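One common refinement, offered here as a sketch and not as part of the original example, is to declare the parent as a sealed trait. All subclasses must then live in the same source file, so the compiler can warn when a match misses a case and the catch-all `case _` is no longer needed. The names `SealedSketch` and `describe` are assumptions made for this illustration.

```scala
object SealedSketch {
  sealed trait Vehicle
  case class Car(name: String) extends Vehicle
  case class Bike(name: String) extends Vehicle

  // Both subclasses are covered; removing one case would trigger a
  // "match may not be exhaustive" warning at compile time.
  def describe(v: Vehicle): String = v match {
    case Car(name)  => "Car " + name
    case Bike(name) => "Bike " + name
  }

  def main(args: Array[String]): Unit = {
    println(describe(Car("A")))  // Car A
    println(describe(Bike("B"))) // Bike B
  }
}
```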