Referrence
https://github.com/kodecocodes/swift-algorithm-club
Array
Array is also a RandomAccessCollection which makes guarantees about efficiency.
This means that no matter how big your array gets, computing count will always take the same amount of time.
Other data structures such as linked lists and trees do not have constant time access.
The most efficient scenario for adding an element to an array is to append it at the end of the array
Inserting new elements from anywhere aside from the end of the array will force elements to shuffle backwards to make room for the new element
If inserting elements in front of a collection is a common operation for your program, you may want to consider a different data structure to hold your data.
The second factor that determines the speed of insertion is the array’s capacity.
However, the standard library employs a subtle trick to prevent appends from being O(n) time. Each time it runs out of storage and needs to copy, it doubles the capacity.
Dictionary
Dictionaries don’t have any guarantees of order, nor can you insert at a specific index.
Inserting into a dictionary is always O(1). Lookup operations are also done in O(1) time, which is significantly faster than the O(n) lookup time for a particular element in the array.
Surprisingly, many of these other data structures can be built efficiently using these two simple standard library primitives.
Linked list
A linked list is a collection of values arranged in a linear unidirectional sequence.
A linked list has several theoretical advantages over contiguous storage options such as the Swift Array:
- Constant time insertion and removal from the front of the list.
- Reliable performance characteristics.
As the diagram suggests, a linked list is a chain of nodes
. Nodes have two responsibilities:
- Hold a value.
- Hold a reference to the next node. A nil value represents the end of the list.
class Node<V> {
var value: V
var next: Node?
init(value: V, next: Node? = nil) {
self.value = value
self.next = next
}
}
extension Node: CustomStringConvertible {
var description: String {
guard let next = next else { return "\(value)" }
return "\(value) -> \(next)"
}
}
let n1 = Node(value: 1)
let n2 = Node(value: 2)
let n3 = Node(value: 3)
n1.next = n2
n2.next = n3
print(n1) // 1 -> 2 -> 3
You can easily see that building long lists this way is impractical. A common way to alleviate this problem is to build a LinkedList that manages the Node objects
LinkedList
A linked list has the concept of a head and tail, which refers to the first and last nodes of the list respectively.
There are three ways to add values to a linked list, each having their own unique performance characteristics:
- push: Adds a value at the front of the list.
- append: Adds a value at the end of the list.
- insert(after:): Adds a value after a particular node of the list.
struct LinkedList<V> {
var head: Node<V>?
var tail: Node<V>?
init(){}
var isEmpty: Bool { head == nil }
}
push O(1)
Adding a value at the front of the list is known as a push operation. This is also known as head-first insertion.
mutating func push(value: V) {
head = Node(value: value, next: head)
if tail == nil {
tail = head
}
}
append O(1)
The next operation you’ll look at is append. This is meant to add a value at the end of the list, and is known as tail-end insertion.
mutating func append(value: V) {
guard !isEmpty else {
return push(value: value)
}
let newNode = Node(value: value)
tail?.next = newNode
tail = newNode
}
length
var length: Int {
var current = head
var count = 0
while current != nil {
count += 1
current = current?.next
}
return count
}
node(at:) O(i)
func node(at index: Int) -> Node<V>? {
var current = head
var currentIdx = 0
while current != nil, currentIdx < index {
current = current?.next
currentIdx += 1
}
return current
}
firstNode(withValue:)
lastNode(withValue:)
func firstNode(withValue value: V) -> Node<V>? {
var current = head
while current != nil {
if current?.value == value {
break
}
current = current?.next
}
return current
}
func lastNode(withValue value: V) -> Node<V>? {
var current = head
var found: Node<V>?
while current != nil {
if current?.value == value {
found = current
}
current = current?.next
}
return found
}
insert(after:) O(i)
@discardableResult
mutating func insert(value: V, after node: Node<V>) -> Node<V>? {
guard tail !== node else {
append(value: value)
return tail
}
node.next = Node(value: value, next: node.next)
return node.next
}
By moving the head down a node, you’ve effectively removed the first node of the list. ARC will remove the old node from memory once the method finishes, since there will be no more references attached to it.
Removals
@discardableResult
mutating func dropFirst() -> V? {
defer {
head = head?.next
if isEmpty {
tail = nil
}
}
return head?.value
}
@discardableResult
mutating func dropLast() -> V? {
guard !isEmpty else {
return nil
}
if head === tail {
return dropFirst()
}
var prev = head
var current = head
while let next = current.next {
prev = current
current = next
}
prev.next = nil
tail = prev
/*
guard let lastPrevious = node(at: length - 2) else {
return nil
}
lastPrevious.next = nil
tail = lastPrevious
*/
return tail?.value
}
@discardableResult
mutating func remove(at index: Int) -> V? {
guard let nodeCurrent = node(at: index) else {
return nil
}
if nodeCurrent === head {
return dropFirst()
}
else if nodeCurrent === tail {
return dropLast()
}
guard let nodePrevious = node(at: index - 1),
let nodeAfter = node(at: index + 1) else {
return nil
}
nodePrevious.next = nodeAfter
return nodeCurrent.value
}
A linked list can earn two qualifications from the Swift collection protocols. First, since a linked list is a chain of nodes, adopting the Sequence protocol makes sense. Second, since the chain of nodes is a finite sequence, it makes sense to adopt the Collection protocol.
Swift Collection
A collection type is a finite sequence and provides nondestructive sequential access.
The linked list cannot achieve O(1) subscript operations using integer indexes. Thus, your goal is to define a custom index that contains a reference to its respective node.
/// The `..<` operator
/// creates a range that doesn't include the upper bound, so it's always
/// safe to use with `endIndex`. For example:
let numbers = [10, 20, 30, 40, 50]
if let index = numbers.firstIndex(of: 30) {
print(numbers[index ..< numbers.endIndex])
}
Example:
var list = LinkedList<Int>()
for i in 0...9 {
list.append(value: i)
}
print("first: \(list[list.startIndex])") // 0
print("last: \(list[list.index(list.startIndex, offsetBy: list.count-1)])") // 9
print("first 3: \(Array(list.prefix(3)))")
print("last 1: \(Array(list.suffix(1)))")
COW Copy-On-Write
let arr1 = [1, 2]
var arr2 = arr1
arr2.append(3) // only arr2 was updated
Unfortunately, your linked list does not have value semantics! This is because your underlying storage uses a reference type (Node).
The strategy to achieve value semantics with COW is fairly straightforward. Before mutating the contents of the linked list, you want to perform a copy of the underlying storage and update all references (head and tail) to the new copy.
// copy-on-write: recreate all nodes when mutating
private mutating func _COW() {
guard !isKnownUniquelyReferenced(&head) else {
return
}
guard var oldHead = head else {
return
}
head = Node(value: oldHead.value)
var newTail = head
while let next = oldHead.next {
newTail.next = .init(value: next.value)
newTail = newTail.next
oldHead = next
}
tail = newTail
}
In the Swift Standard Library lives a function named
isKnownUniquelyReferenced
. This function can be used to determine whether or not an object has exactly one reference to itThe unidirectional nature of the linked list means that
head first insertions
can ignore the “COW tax”!
Stack
LIFO (last in first out)
When you add an item to a stack, you place it on top of the stack. When you remove an item from a stack, you always remove the topmost item.
There are only two essential operations for a stack:
- push - adding an element to the top of the stack
- pop - removing the top element of the stack
This means you can only add or remove elements from one side of the data structure.
Choosing the right storage type for your stack is important. The array is an obvious choice, since it offers constant time insertions and deletions at one end via
append
andpopLast
.
struct Stack<Element> {
var storage: [Element] = []
mutating func push(element: Element) {
storage.append(element)
}
@discardableResult
mutating func pop() -> Element? {
guard !isEmpty else {
return nil
}
return storage.popLast()
}
func peek() -> Element? {
storage.last
}
var isEmpty: Bool {
peek() == nil
}
}
extension Stack: CustomStringConvertible {
var description: String {
return "------ top ------\n" +
storage.map{ "\($0)" }.reversed().joined(separator: "\n") +
"\n-----------------"
}
}
var stack = Stack<Int>()
stack.push(element: 1)
stack.push(element: 2)
stack.push(element: 3)
print(stack)
stack.pop()
print(stack)
print(stack.peek())
push and pop both have a O(1) time complexity.The idea of peek is to look at the top element of the stack without mutating its contents.
A stack's purpose is to
limit the number of ways to access your data
, and adopting protocols such as Collection would go against this goal by exposing all the elements via iterators and the subscript. In this case, less is more!
Stacks are crucial to problems that search trees and graphs. Imagine finding your way through a maze. Each time you come to a decision point of left right or straight you can push all possible decisions onto your stack. When you hit a dead end, simply backtrack by popping from the stack and continuing until you escape or hit another dead end.
Queue
FIFO(first-in-first-out)
Queues are handy when you need to maintain the order of your elements to process later.
- enqueue: Insert an element at the back of the queue. Returns true if the operation was successful.
- dequeue: Remove the element at the front of the queue and return it.
- isEmpty: Check if the queue is empty.
- peek: Return the element at the front of the queue without removing it.
In the following sections, you will learn to create a queue in four
different ways:
- Using an array
- Using a doubly linked list
- Using a ring buffer
- Using two stacks
Array
Here you’ve defined a generic QueueArray struct that adopts the Queue protocol. Note that the associated type Element is inferred by the type parameter T.
Regardless of the size of the array, enqueueing an element is an O(1) operation.
After adding multiple elements, the array will eventually be full. When you want to use more than the allocated space, the array must resize to make additional room.
Resizing is an O(n) operation. Resizing requires the array to allocate new memory and copy all existing data over to the new array.
Removing an element from the front of the queue is an O(n) operation. To dequeue, you remove the element from the beginning of the array. This is always a linear time operation, because it requires all the remaining elements in the array to be shifted in memory.
protocol Queue {
associatedtype Element
mutating func enqueue(element: Element) -> Bool
mutating func dequeue() -> Element?
var isEmpty: Bool {get}
func peek() -> Element?
}
struct ArrayQueue<T>: Queue {
private var storage: [T] = []
@discardableResult
mutating func enqueue(element: T) -> Bool {
storage.append(element)
return true
}
@discardableResult
mutating func dequeue() -> T? {
guard !isEmpty else {
return nil
}
return storage.removeFirst()
}
func peek() -> T? {
storage.first
}
var isEmpty: Bool {
storage.isEmpty
}
}
extension ArrayQueue: CustomStringConvertible {
var description: String {
storage.description
}
}
var q = ArrayQueue<Int>()
q.enqueue(element: 1)
q.enqueue(element: 2)
q.enqueue(element: 3)
print(q)
q.dequeue()
print(q)
print(q.peek())
There are some shortcomings to the implementation. Removing an item from the front of the queue can be inefficient, as removal causes all elements to shift up by one. This makes a difference for very large queues. Once the array gets full, it has to resize and may have unused space. This could increase your memory footprint over time.
Doubly linked list
A doubly linked list is simply a linked list in which nodes also contain a reference to the previous node.
class Node<T> {
var previous: Node<T>?
var next: Node<T>?
var value: T
init(previous: Node<T>? = nil, next: Node<T>? = nil, value: T) {
self.previous = previous
self.next = next
self.value = value
}
}
extension Node: CustomStringConvertible {
var description: String {
return "\(value)"
}
}
class DoublyLinkedList<T> {
private var head: Node<T>?
private var tail: Node<T>?
init(){}
var isEmpty: Bool { head == nil }
var first: Node<T>? { head }
var last: Node<T>? { tail }
func append(value: T) {
let newNode = Node(value: value)
guard let tailNode = tail else {
head = newNode
tail = head
return
}
newNode.previous = tailNode
tailNode.next = newNode
tail = newNode
}
@discardableResult
func remove(node: Node<T>) -> T? {
let nodePrevious = node.previous
let nodeNext = node.next
// is removing the first one
if let previous = nodePrevious {
previous.next = nodeNext
} else {
head = nodeNext
}
// is removing the last one
if let next = nodeNext {
next.previous = nodePrevious
} else {
tail = nodePrevious
}
node.previous = nil
node.next = nil
return node.value
}
}
extension DoublyLinkedList: CustomStringConvertible {
var description: String {
guard !isEmpty else {
return "Empty List"
}
var current = head
var stringNodes = Array<String>()
while let node = current {
stringNodes.append("\(node.value)")
current = node.next
}
return stringNodes.joined(separator: " <-> ")
}
}
Queue with a doubly linked list:
struct LinkedListQueue<T>: Queue {
private var storage = DoublyLinkedList<T>()
@discardableResult
mutating func enqueue(element: T) -> Bool {
storage.append(value: element) // insert at 0
return true
}
@discardableResult
mutating func dequeue() -> T? { // remove last
guard let first = storage.first else {
return nil
}
return storage.remove(node: first)
}
func peek() -> T? {
storage.first?.value
}
var isEmpty: Bool {
storage.isEmpty
}
}
extension LinkedListQueue: CustomStringConvertible {
var description: String {
storage.description
}
}
var q = LinkedListQueue<Int>()
q.enqueue(element: 1)
q.enqueue(element: 2)
q.enqueue(element: 3)
print(q) // (head)1 <-> 2 <-> 3(tail)
q.dequeue()
print(q) // 2 <-> 3
print(q.peek()) // Optional(2)
- Enqueue: O(1)
- Dequeue: O(1)
Compared to the array implementation, you didn’t have to shift elements one by one. Instead, in the diagram above, you simply update the next and previous pointers.
Each element has to have extra storage for the forward and back reference. Moreover, every time you create a new element, it requires a relatively expensive dynamic allocation. By contrast QueueArray does bulk allocation which is faster.
Ring buffer
Adding new items to the back of the queue is fast, O(1), but removing items from the front of the queue is slow, O(n).Removing is slow because it requires the remaining array elements to be shifted in memory.
You can use a queue based on a ring buffer to keep track of whose turn is coming up next.
A ring buffer, also known as a circular buffer, is a fixed-size array.
The ring buffer has two pointers that keep track of two things:
- The read pointer keeps track of the front of the queue.
- The write pointer keeps track of the next available slot so you can override existing elements that have already been read.
Each time you add an item to the queue, the write pointer increments by one.
Dequeuing is the equivalent of reading a ring buffer.
Since the write pointer reached the end, it simply wraps around to the starting index again.
The read pointer wraps to the beginning as well.
As final observation, notice that whenever the read and write pointers are at the same index, that means the queue is empty.
But hold up, you say, how does this wrapping around work? There are several ways to accomplish this and I chose a slightly controversial one. In this implementation, the writeIndex and readIndex always increment and never actually wrap around.
- The reason this is a bit controversial is that writeIndex and readIndex always increment, so in theory these values could become too large to fit into an integer and the app will crash. However, a quick back-of-the-napkin calculation should take away those fears.
- Both writeIndex and readIndex are 64-bit integers. If we assume that they are incremented 1000 times per second, which is a lot, then doing this continuously for one year requires ceil(log_2(365 * 24 * 60 * 60 * 1000)) = 35 bits. That leaves 28 bits to spare, so that should give you about 2^28 years before running into problems. That's a long time. :-)
- You've seen that a ring buffer can make a more optimal queue but it also has a disadvantage: the wrapping makes it tricky to resize the queue. But if a fixed-size queue is suitable for your purposes, then you're golden.
A ring buffer is also very useful for when a producer of data writes into the array at a different rate than the consumer of the data reads it. This happens often with file or network I/O. Ring buffers are also the preferred way of communicating between high priority threads (such as an audio rendering callback) and other, slower, parts of the system.
The implementation given here is not thread-safe. It only serves as an example of how a ring buffer works. That said, it should be fairly straightforward to make it thread-safe for a single reader and single writer by using OSAtomicIncrement64()
to change the read and write pointers.
RingBuffer
class RingBuffer<T> {
private var writeIndex = 0
private var readIndex = 0
private var storage = Array<T?>()
init(count: Int) {
storage = Array<T?>(repeating: nil, count: count)
}
var first: T? {
guard !isEmpty else {
return nil
}
return storage[readIndex]
}
@discardableResult
func write(_ element: T) -> Bool {
print("write index: \(writeIndex)")
if !isFull {
storage[writeIndex % storage.count] = element
writeIndex += 1
return true
}
else {
return false
}
}
func read() -> T? {
print("read index: \(readIndex)")
if !isEmpty {
let head = storage[readIndex % storage.count]
readIndex += 1
return head
}
else {
return nil
}
}
var isEmpty: Bool {
availableSpaceForReading == 0
}
private var availableSpaceForReading: Int {
writeIndex - readIndex
}
private var availableSpaceForWriting: Int {
storage.count - availableSpaceForReading
}
private var isFull: Bool {
availableSpaceForWriting == 0
}
}
extension RingBuffer: CustomStringConvertible {
public var description: String {
print(storage)
let values = (0..<availableSpaceForReading).map {
String(describing: storage[($0 + readIndex) % storage.count]!)
}
return "[" + values.joined(separator: ", ") + "]"
}
}
RingBufferQueue
struct RingBufferQueue<T>: Queue {
private var storage: RingBuffer<T>
init(count: Int) {
self.storage = .init(count: count)
}
@discardableResult
mutating func enqueue(element: T) -> Bool {
return storage.write(element)
}
@discardableResult
mutating func dequeue() -> T? {
return storage.read()
}
func peek() -> T? {
storage.first
}
var isEmpty: Bool {
storage.isEmpty
}
}
extension RingBufferQueue: CustomStringConvertible {
var description: String {
storage.description
}
}
var q = RingBufferQueue<Int>(count: 3)
q.enqueue(element: 1) // true
q.enqueue(element: 2) // true
q.enqueue(element: 3) // true
q.enqueue(element: 4) // false
print(q) // [1, 2, 3]
q.dequeue() // 1
print(q) // [2, 3]
print(q.peek()) // 2
The ring buffer has a fixed size which means that enqueue can fail.
Double stack
The idea behind using two stacks is simple. Whenever you enqueue an element, it goes in the right stack. When you need to dequeue an
element, you reverse the right stack and place it in the left stack so you can retrieve the elements using FIFO order.
struct Stack<Element> {
private var storage: [Element] = []
var top: Element? {
peek()
}
var bottom: Element? {
storage.first
}
mutating func push(element: Element) {
storage.append(element)
}
@discardableResult
mutating func pop() -> Element? {
guard !isEmpty else {
return nil
}
return storage.popLast()
}
func peek() -> Element? {
storage.last
}
var isEmpty: Bool {
peek() == nil
}
}
extension Stack {
mutating func reversed() -> Self {
.init(storage: storage.reversed())
}
mutating func removeAll() {
storage.removeAll()
}
var elements: [Element] {
storage.map{ $0 }
}
}
extension Stack: CustomStringConvertible {
var description: String {
return "------ top ------\n" +
storage.map{ "\($0)" }.reversed().joined(separator: "\n") +
"\n-----------------"
}
}
StacksQueue
public struct StacksQueue<T> : Queue {
private var leftStack: [T] = []
private var rightStack: [T] = []
public init() {}
}
extension StacksQueue: CustomStringConvertible {
var description: String {
print("left stack: \(leftStack), right stack: \(rightStack)")
let all = (leftStack.elements + rightStack.elements).map{ "\($0)" }
return "queue: [\(all.joined(separator: ", "))]"
}
}
var q = StacksQueue<Int>()
q.enqueue(element: 1)
q.enqueue(element: 2)
q.enqueue(element: 3)
print(q)
q.dequeue()
print(q)
print(q.peek())
Remember, you only transfer the elements in the right stack when the left stack is empty!
Moreover, your two stack implementation is fully dynamic and doesn’t have the fixed size restriction that your ring-buffer-based queue implementation has.
Finally, it beats the linked list in terms of spacial locality. This is because array elements are next to each other in memory blocks. So a large number of elements will be loaded in a cache on first access.
Compare this to a linked list where the elements aren’t in contiguous blocks of memory. This could lead to more cache misses which will increase access time.