Deep Learning CNN

CNN

  • Basics
    • Tensors
    • Data whose dimension is ≥ 3
    • For image data, input tensor = width × height × number of channels
    • output tensor = width × height × depth
    • Convolution
    • Specific features (patterns) of an image appear only in a local area of the whole image
    • Goal is to get a scanner (filter / kernel) that can detect those features: if there is a match anywhere on the image, it reacts
  • Convolutional layer
    • Network structure
    • layers have neurons arranged in 3 dimensions: width, height, depth
    • depth slice:
      • for one unit of depth, the slice of it is called depth slice
      • all neurons in a depth slice share exactly the same weights and bias (weight sharing)
      • weights are structured in a matrix, which is called kernel
      • each slice has its own kernel
      • For example in p11:
      • the 3×3 grid is all the weights used over that 6×6 slice
      • each value in the resulting 4×4 output is a scalar that indicates the filter's reaction level; the higher, the more intense the match
    • Sliding / Convolving
    • The process of taking the 6×6 input feature map, sliding the 3×3 filter over it, and getting the 4×4 output feature map
    • All slices always look at the same region, but each slice uses a different filter in the convolving process
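The sliding step can be sketched in plain NumPy (not from the source — a minimal "valid" cross-correlation, which is what CNN conv layers actually compute):

```python
import numpy as np

def convolve2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` with no padding ('valid' mode) and
    return the map of filter responses."""
    H, W = image.shape
    kh, kw = kernel.shape
    out_h = (H - kh) // stride + 1
    out_w = (W - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)  # one scalar reaction level
    return out

image = np.arange(36, dtype=float).reshape(6, 6)  # 6×6 input feature map
kernel = np.ones((3, 3)) / 9.0                    # 3×3 averaging filter
feat = convolve2d_valid(image, kernel)
print(feat.shape)  # (4, 4)
```

A 6×6 input with a 3×3 filter and stride 1 gives the 4×4 output described above; a real conv layer would repeat this once per filter to build the depth dimension.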
    • Layer depth, Stride and zero-padding
    • layer depth = # of filters
    • stride = how much a filter moves per step
      • stride = 1 -> move one pixel per step
    • zero-padding -> pad the image with a 1- or 2-pixel-wide ring of 0s
      • used when features are near the boundary
    • number of params: see P20
    • the 10 there is the number of input channels; for RGB it's 3, etc
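The standard output-size and parameter-count formulas can be sketched as helpers (my own function names, assuming square inputs and filters):

```python
def conv_output_size(n, f, padding=0, stride=1):
    """Spatial output size of a conv layer: floor((n - f + 2p) / s) + 1."""
    return (n - f + 2 * padding) // stride + 1

def conv_param_count(f, in_channels, num_filters):
    """Each filter has f*f*in_channels weights plus 1 bias."""
    return num_filters * (f * f * in_channels + 1)

print(conv_output_size(6, 3))             # 4: the 6×6 -> 4×4 case
print(conv_output_size(6, 3, padding=1))  # 6: padding preserves the size
print(conv_param_count(3, 3, 16))         # 448: sixteen 3×3 filters on RGB
```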
  • Pooling layer
    • what it is:
    • Subsampling the image to make it smaller while retaining the object itself
    • Brings down the number of parameters and helps avoid overfitting
    • how it's done:
    • max pooling, see P22
      • move the window, only output the max value in that window
    • other pooling variants exist, e.g. average pooling
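Max pooling in the window-by-window form described above can be sketched like this (not from the source):

```python
import numpy as np

def max_pool(x, size=2, stride=2):
    """2-D max pooling: move the window, keep only the max in each window."""
    H, W = x.shape
    out_h = (H - size) // stride + 1
    out_w = (W - size) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i*stride:i*stride+size,
                          j*stride:j*stride+size].max()
    return out

x = np.array([[1, 3, 2, 4],
              [5, 6, 1, 0],
              [7, 2, 9, 8],
              [0, 1, 3, 4]], dtype=float)
print(max_pool(x))  # [[6. 4.]
                    #  [7. 9.]]
```

A 2×2 window with stride 2 halves each spatial dimension, which is where the parameter reduction comes from.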
  • When close to the output layer, the tensor is flattened and fed through FC layers to get the final output
  • The training process still uses gradient descent; only CONV and FC layers are trained, because they are the ones with params
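The flatten-then-FC step amounts to a reshape followed by a matrix multiply; a minimal NumPy sketch (the 4×4×8 tensor and the 10 output classes are made-up shapes):

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend this is the last conv/pool output: width × height × depth
tensor = rng.standard_normal((4, 4, 8))

flat = tensor.reshape(-1)            # flatten: 4*4*8 = 128 values
W = rng.standard_normal((10, 128))   # FC layer: 128 features -> 10 classes
b = np.zeros(10)
logits = W @ flat + b                # final output scores, one per class
print(logits.shape)  # (10,)
```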
  • Transfer learning:
    • Use a pretrained model, remove its last FC layer, train a new FC layer, and connect it in place of the old one
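A toy NumPy sketch of the idea: a frozen random projection stands in for the pretrained conv stack, and only the new head's parameters get gradient-descent updates. The data, shapes, and hidden labeling rule are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# "Pretrained" feature extractor: frozen, never updated during training
W_frozen = rng.standard_normal((16, 64))
def extract_features(x):
    return np.maximum(W_frozen @ x, 0.0)  # stand-in for the frozen conv stack

# Toy binary task: 64-dim inputs, labels from a hidden rule
X = rng.standard_normal((200, 64))
y = (X[:, 0] > 0).astype(float)

F = np.array([extract_features(x) for x in X])  # features computed once: (200, 16)

# New FC head: the only parameters we train
w, b = np.zeros(16), 0.0
lr = 0.1
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))  # sigmoid output
    grad_w = F.T @ (p - y) / len(y)         # logistic-loss gradients
    grad_b = np.mean(p - y)
    w -= lr * grad_w                        # gradient descent on the head only
    b -= lr * grad_b

acc = np.mean((p > 0.5) == (y == 1))
print(f"train accuracy of new head: {acc:.2f}")
```

In a real framework you would instead load pretrained weights, mark the backbone layers as non-trainable, and attach a fresh FC layer.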
  • Use cases (slides 29-33, 35-40):
    • ECG classes
    • Medical image classification
    • Fish recognition
    • Image style transfer
    • Image sentiment analysis
    • Image recommender
