Deep Learning FC

Image data

  • Methods to represent image data:
    • RGB(r,g,b)
    • r,g,b → [0, 255]
    • White: (255,255,255)
    • Grayscale
  • Image size
    • Size = Width * Height * #Channels (see the sketch after this list)
  • Image datasets:
    • ImageNet
    • MNIST
    • CIFAR-10
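
A minimal sketch of these representations in NumPy; the 32x32 RGB and 28x28 grayscale sizes are assumed examples (roughly CIFAR-10 and MNIST resolutions):

```python
import numpy as np

# An RGB image: one (r, g, b) triple per pixel, each value in [0, 255]
rgb = np.zeros((32, 32, 3), dtype=np.uint8)   # height x width x channels
rgb[:, :] = (255, 255, 255)                   # set every pixel to white

# A grayscale image: a single channel, values in [0, 255]
gray = np.zeros((28, 28), dtype=np.uint8)

# Size = Width * Height * #Channels
print(rgb.size)    # 32 * 32 * 3 = 3072
print(gray.size)   # 28 * 28 * 1 = 784
```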

Unsupervised: Image clustering

  • Basics:
    • Cluster images based on their “similarity”
    • Visual, distance, correlation, …
    • Extract features with any suitable method and use k-means to cluster (see the sketch after this list)
  • Useful when labels are unknown and labeling is expensive
  • Can be subjective and inaccurate
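
A minimal sketch of the clustering idea, assuming flattened pixels as the feature extractor and random data in place of real images (both are placeholders):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(100, 28, 28))          # 100 fake grayscale images

# Feature extraction: here simply flatten each image into one vector;
# any other feature extractor could be swapped in.
features = images.reshape(len(images), -1).astype(float)

# Group the feature vectors into k clusters with k-means (k = 10 is an assumption)
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(features)
print(kmeans.labels_[:10])                                  # cluster id for the first 10 images
```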

Supervised: Deep Feedforward Network

  • Basics:
    • AKA Multilayer perceptrons
    • Feedforward because information flows in only one direction, from input to output
    • Connections that feed outputs back into the model are called feedback connections
    • Deep because there are many hidden layers
    • Depth: Number of hidden layers
    • Width: Number of nodes on ONE hidden layer
    • Why deep rather than wide: a single hidden layer may need to be extremely large to achieve what several stacked layers can, among other reasons
    • Weights & bias: associated with each neuron
    • Activation function
    • $\sigma(z)$
    • Essentially $f(z)$, but conventionally written with $\sigma$
  • How the forward pass works (see the sketch after this list):
    • On each neuron, it takes all the $a$ from the previous layer
    • It then calculates a value $z$ from those incoming $a$ values
    • $z = a_1w_1 + a_2w_2 + \dots + a_kw_k + b$
    • $w$ and $b$ are constantly adjusted during training
    • A $w$ is assigned to each edge
    • Use an activation function to convert $z$ into the $a$ that is passed to the next layer
    • $z → \sigma(z) → a$
    • That $a$ is then used by the next layer
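
A minimal sketch of one neuron’s computation, with assumed values for the incoming activations, weights, and bias, and sigmoid as the activation function:

```python
import numpy as np

def sigmoid(z):
    # Activation function sigma(z)
    return 1.0 / (1.0 + np.exp(-z))

a_prev = np.array([0.5, 0.1, 0.9])   # activations a1..a3 from the previous layer (assumed)
w = np.array([0.2, -0.4, 0.7])       # one weight per incoming edge (assumed)
b = 0.1                              # bias of this neuron (assumed)

z = np.dot(a_prev, w) + b            # z = a1*w1 + a2*w2 + a3*w3 + b
a = sigmoid(z)                       # z -> sigma(z) -> a, passed to the next layer
print(z, a)
```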

Model training

  • Parameter types
    • Parametric: logistic classifier
    • Non-Parametric: KNN
    • Hyperparameters: predefined / fixed
    • Parameters: updated during model training
  • Loss function
    • Measures how much the predicted result deviates from the actual result; the smaller, the better
    • $L_{\text{data}} = \sum_{i=1}^{N} L_i$
    • For classification tasks, the loss function can be defined using cross-entropy
    • Need to control overfitting using regularization methods
  • Gradient descent
    • How to find the best parameters (see the training sketch after this list):
    • Calculate L
    • Update W in next step as:
      • $W_{k+1} \leftarrow W_k - \alpha \nabla_W L$
      • $b_{k+1} \leftarrow b_k - \beta \nabla_b L$
    • Repeat until the gradient is small enough or $k$ reaches a limit
  • Number of parameters:
    • param number = input_shape x layer width (W) + layer width (b); the sketch below prints this count for an example network
    • 1st hidden layer: input_shape = shape of the input
    • later hidden layers & output layer: input_shape = layer width of the previous layer
    • for the output layer, layer width = number of output classes
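
A minimal sketch of one training step for a small feedforward network, tying the pieces above together; the layer sizes (784 → 32 → 10), learning rates, and random data are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Parameters: one W per layer (one weight per edge) and one b per layer
W1 = rng.normal(0.0, 0.01, (784, 32)); b1 = np.zeros(32)   # hidden layer, width 32
W2 = rng.normal(0.0, 0.01, (32, 10));  b2 = np.zeros(10)   # output layer, 10 classes

# Parameter count = input_shape x layer width (W) + layer width (b), per layer
print(W1.size + b1.size + W2.size + b2.size)               # 784*32 + 32 + 32*10 + 10 = 25450

# A fake batch: 64 flattened 28x28 grayscale images with one-hot labels
x = rng.random((64, 784))
y = np.eye(10)[rng.integers(0, 10, 64)]

# Forward pass
a1 = sigmoid(x @ W1 + b1)
p = softmax(a1 @ W2 + b2)

# Cross-entropy loss: L_data = sum of the per-example losses L_i (averaged here)
L = -np.mean(np.sum(y * np.log(p + 1e-12), axis=1))
print(L)

# Backward pass: gradients of L w.r.t. each parameter
dz2 = (p - y) / len(x)
dW2 = a1.T @ dz2;             db2 = dz2.sum(axis=0)
dz1 = (dz2 @ W2.T) * a1 * (1 - a1)
dW1 = x.T @ dz1;              db1 = dz1.sum(axis=0)

# Gradient descent update: W_{k+1} <- W_k - alpha * grad_W L, b_{k+1} <- b_k - beta * grad_b L
alpha = beta = 0.1
W1 -= alpha * dW1; b1 -= beta * db1
W2 -= alpha * dW2; b2 -= beta * db2
```

Repeating the forward pass, loss, backward pass, and update over many batches drives the loss down until the gradient is small enough or the step count $k$ hits a limit.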
