3.1 Introduction
In the last few years, intensive research has been carried out on remote sensing scene classification across different datasets. Local descriptors such as local binary patterns (LBP), local ternary patterns (LTP), completed local binary patterns (CLBP), and histograms of oriented gradients (HOG) have proven their worth in a variety of scene classification tasks.
Deep learning has brought a revolutionary change to the field of machine learning: accuracy on many datasets jumped once deep learning approaches were applied.
In this chapter, we discuss some hand-engineered feature extraction approaches that have been used to efficiently classify remote sensing image scenes. We also describe some deep learning based approaches to classifying these scenes.
They reported 97.10% accuracy using a fine-tuned GoogLeNet on the UC Merced dataset and 91.83% accuracy using GoogLeNet on the Brazilian coffee scene dataset; both results were state of the art on these datasets.
Wang, Limin, et al. [5] used VGG networks to achieve state-of-the-art accuracy on Places205, MIT67, and SUN397. They used the Caffe toolbox to implement their ConvNets, designing three VGG variants, VGG11, VGG13, and VGG16, and trained them on the Places205 dataset. To reduce computational cost, they initialized VGG13 with the weights of VGG11, and VGG16 with the weights of VGG13. Using these VGG nets, they obtained state-of-the-art accuracy on all three datasets compared to GoogLeNet and AlexNet.
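The weight-transfer trick above, initializing a deeper VGG variant from a shallower one, can be sketched as follows. The layer names, tensor shapes, and plain-dictionary parameter representation are illustrative assumptions, not the authors' actual Caffe implementation:

```python
import numpy as np

def transfer_weights(shallow, deep):
    """Copy parameters from a shallower net into a deeper one wherever
    layer names and shapes match; the remaining (new) layers keep their
    fresh initialization and are trained from scratch."""
    transferred = []
    for name, w in shallow.items():
        if name in deep and deep[name].shape == w.shape:
            deep[name] = w.copy()
            transferred.append(name)
    return transferred

# Toy "VGG11 -> VGG13"-style parameter dictionaries (shapes invented).
rng = np.random.default_rng(0)
vgg11 = {"conv1": rng.normal(size=(64, 3, 3, 3)),
         "conv2": rng.normal(size=(128, 64, 3, 3))}
vgg13 = {"conv1": np.zeros((64, 3, 3, 3)),
         "conv1_b": np.zeros((64, 64, 3, 3)),  # extra layer, no donor weights
         "conv2": np.zeros((128, 64, 3, 3))}

copied = transfer_weights(vgg11, vgg13)
print(copied)  # layers initialized from the shallower net
```

Only the matching layers are copied; the extra layer keeps its own initialization, which is the essence of how the deeper nets avoid retraining from scratch.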
Krizhevsky et al. [6] trained a deep convolutional neural network to classify the 1.2 million high-resolution images of the ImageNet LSVRC-2010 contest into 1000 different classes. On the test data, they achieved top-1 and top-5 error rates of 37.5% and 17.0%, considerably better than the previous state of the art. Their network has 60 million parameters and 650,000 neurons, and consists of five convolutional layers followed by three fully connected layers. They also achieved a 15.3% top-5 error rate in ILSVRC-2012, the best result in that competition. ILSVRC is a worldwide image classification competition whose data is drawn from the ImageNet dataset. They used ReLU nonlinearities, dropout, and data augmentation to speed up training and reduce overfitting.
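The top-1 and top-5 error rates quoted above have a simple definition: a prediction counts as correct under top-k if the true label appears among the k highest-scoring classes. A minimal sketch, with made-up scores and labels:

```python
import numpy as np

def topk_error(scores, labels, k):
    """Fraction of samples whose true label is NOT among the k
    highest-scoring classes."""
    topk = np.argsort(scores, axis=1)[:, -k:]        # indices of k best classes
    hit = np.any(topk == labels[:, None], axis=1)    # true label in the top k?
    return 1.0 - hit.mean()

# Toy example: 4 samples, 5 classes.
scores = np.array([[0.1, 0.6, 0.1, 0.1, 0.1],   # predicts class 1
                   [0.5, 0.2, 0.1, 0.1, 0.1],   # predicts class 0
                   [0.1, 0.1, 0.3, 0.4, 0.1],   # predicts class 3
                   [0.2, 0.1, 0.1, 0.1, 0.5]])  # predicts class 4
labels = np.array([1, 0, 2, 0])

print(topk_error(scores, labels, 1))  # 0.5: two of four top-1 guesses are wrong
print(topk_error(scores, labels, 5))  # 0.0: the label is always in the full top 5
```

Top-5 error is always at most top-1 error, which is why the 17.0% figure is so much lower than the 37.5% one.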
These 16 features included 12 features computed from the 6 multispectral bands, namely the mean value and standard deviation of each band. In addition, we chose intensity, texture variance, texture mean, and NDVI (Normalized Difference Vegetation Index) for classification. Finally, training samples were selected for each classification category based on the previously segmented and merged objects.
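A minimal NumPy sketch of computing the per-band statistics and NDVI described above. The band ordering (Red at index 2, NIR at index 3) and the synthetic 6-band image are assumptions for illustration only:

```python
import numpy as np

def band_features(bands):
    """Per-band mean and standard deviation for a (n_bands, H, W) image."""
    means = bands.mean(axis=(1, 2))
    stds = bands.std(axis=(1, 2))
    return means, stds

def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red + eps)

# Toy 6-band image, 4x4 pixels; assumed band order: B, G, R, NIR, SWIR1, SWIR2.
rng = np.random.default_rng(42)
img = rng.uniform(0.0, 1.0, size=(6, 4, 4))

means, stds = band_features(img)          # 12 of the 16 features
veg = ndvi(img[3], img[2])                # per-pixel NDVI from NIR and Red bands
print(means.shape, stds.shape, veg.shape)
```

In practice these statistics would be computed per segmented object rather than over the whole image, with intensity and the texture measures appended to form the full 16-feature vector.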
The training data contained both labeled data $D_{la}=\{(x_i, y_i)\}_{i=1}^{kl}$ and unlabeled data $D_{un}=\{x_j\}_{j=kl+1}^{kl+u}$, where $x_i$ is the feature descriptor of image $i$ and $y_i \in \{1,\dots,k\}$ is its label; $k$ is the number of categories, $l$ is the number of labeled samples in each category, and $u$ is the number of unlabeled samples. Our method aims to learn a high-level image representation $S$ by exploiting the few labeled data $D_{la}$ and great quantities of unlabeled ones, which is then fed into different classifiers to obtain the final classification results. The procedure of semisupervised feature learning by SSEP is shown in Fig. 1. First, a new sampling algorithm based on GNA [19] is proposed to produce $T$ WT sets $P^t=\{(s_i^t, c_i^t)\}_{i=1}^{kp}$, $t \in \{1,\dots,T\}$.
In AI, the pace of recent research has been remarkable. Artificial systems now match human performance in challenging object recognition tasks (Krizhevsky et al., 2012) and outperform expert humans in dynamic,
In 1997, Steve Lawrence and Andrew D. Back [1] proposed a convolutional neural network approach which includes the
Another direct encoding scheme for designing CNN architectures was proposed by Masanori et al. [14]. They created new network architectures based on genetic programming; more specifically, a Cartesian Genetic Programming (CGP) encoding scheme represents the network structure and its connectivity. The main advantage of the proposed method is its flexibility: exploring variable-length architectures and skip connections resulted in non-standard CNN architectures. To evaluate each architecture evolved by CGP, they used performance on a validation set as the fitness for the evolutionary method. They also used predefined structural blocks called highly functional modules. Using these modules, they simulated the behavior of
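The evolutionary loop behind such a search can be caricatured as a (1+λ) strategy over integer genomes. In this sketch the fitness function is a placeholder; in the real method it would decode the genome into a CNN, train it, and return validation accuracy. All constants (genome length, block choices, λ) are illustrative assumptions:

```python
import random

random.seed(0)

GENES = 10     # genome length (illustrative)
CHOICES = 4    # block types per gene (e.g. conv, pool, sum, skip)
LAMBDA = 4     # offspring per generation

def mutate(genome, rate=0.2):
    """Point mutation: each gene is resampled with probability `rate`."""
    return [random.randrange(CHOICES) if random.random() < rate else g
            for g in genome]

def fitness(genome):
    # Placeholder standing in for "train the decoded CNN and measure
    # validation accuracy"; here, just a normalized gene sum.
    return sum(genome) / (GENES * (CHOICES - 1))

parent = [random.randrange(CHOICES) for _ in range(GENES)]
f0 = fitness(parent)
for gen in range(30):
    offspring = [mutate(parent) for _ in range(LAMBDA)]
    parent = max(offspring + [parent], key=fitness)  # elitist selection

print(round(fitness(parent), 3))  # never worse than the initial fitness
```

Because the parent survives selection, fitness is monotonically non-decreasing, which is the usual justification for the (1+λ) scheme when each fitness evaluation (a full CNN training run) is expensive.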
Thus a multi-class SVM, following Cody Neuburger [18], is chosen here for the classification. In a traditional SVM, the trained model is formed in a 1×1 structure, and from that structure
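One common way to build a multi-class classifier from binary SVMs is one-vs-rest: train one binary SVM per class and pick the class with the highest score. The sketch below uses a tiny Pegasos-style linear SVM trainer as a stand-in for whatever solver [18] employs; the toy data and all hyperparameters are invented:

```python
import numpy as np

def train_binary_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Tiny linear SVM trained with the Pegasos subgradient method.
    y must be in {-1, +1}; returns a weight vector with bias folded in."""
    rng = np.random.default_rng(seed)
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append constant feature as bias
    w = np.zeros(Xb.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(Xb)):
            t += 1
            eta = 1.0 / (lam * t)
            w = (1 - eta * lam) * w              # shrink (regularization step)
            if y[i] * Xb[i].dot(w) < 1:          # hinge-loss margin violated
                w = w + eta * y[i] * Xb[i]
    return w

def ovr_predict(X, ws):
    """One-vs-rest: pick the class whose binary SVM scores highest."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    scores = np.stack([Xb.dot(w) for w in ws], axis=1)
    return scores.argmax(axis=1)

# Three well-separated 2-D clusters as toy data.
X = np.array([[0, 0], [0.2, 0.1], [5, 5], [5.1, 4.9], [0, 5], [0.1, 5.2]])
y = np.array([0, 0, 1, 1, 2, 2])

ws = [train_binary_svm(X, np.where(y == c, 1, -1)) for c in range(3)]
print(ovr_predict(X, ws))  # should recover the training labels
```

The alternative one-vs-one scheme trains k(k-1)/2 pairwise classifiers and votes; one-vs-rest is shown here only because it needs fewer models.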
It has been seen that the algorithm can improve memory bandwidth; an accelerator can be used in the hardware design to improve memory efficiency. In recent years, hardware architectures for deep learning first employed GPUs to increase speed, then moved to FPGAs, and most recently to ASICs with extra units. An application-specific integrated circuit (ASIC) is an integrated circuit developed to solve a specific problem by building gates that emulate the required logic. The purpose of such chips is to provide maximum performance within a given power and cost budget. ASICs provide a software engine of sorts to run deep learning algorithms while optimizing performance, power, and memory. The different components of a hardware accelerator are
Stochastic Gradient Descent (SGD) on mini-batches (details provided in Sec. 5.3.3) is used to train Fast R-CNN, sampling $I$ images and $R/I$ RoIs from each image. As noted by the authors, a small $I$ decreases the computational cost of a mini-batch: with $I$ = 2 and $R$ = 128, this strategy is roughly 64 times faster than sampling one RoI from each of 128 images. In addition, the training process of Fast R-CNN is streamlined into a single fine-tuning stage that jointly optimizes the softmax classifier and the bounding-box regressors, instead of training the softmax classifier, SVMs, and regressors in separate stages \citep{he2014spatial}.
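The hierarchical sampling strategy ($I$ images, then $R/I$ RoIs per image, so RoIs in a mini-batch share convolutional features) might look like the following sketch; the list-of-lists dataset layout and index-only RoIs are stand-ins for real images and region proposals:

```python
import random

def sample_minibatch(dataset, I=2, R=128, seed=0):
    """Fast R-CNN-style hierarchical sampling (sketch): choose I images,
    then R/I RoIs from each, instead of R RoIs from R different images."""
    rng = random.Random(seed)
    images = rng.sample(range(len(dataset)), I)
    per_image = R // I
    return [(img, roi) for img in images
            for roi in rng.sample(range(len(dataset[img])), per_image)]

# Toy "dataset": 10 images, each with 200 candidate RoIs (indices only).
dataset = [list(range(200)) for _ in range(10)]
batch = sample_minibatch(dataset)
print(len(batch))                       # 128 RoIs drawn from just 2 images
print(len({img for img, _ in batch}))   # 2 distinct images
```

The speedup comes from the forward/backward pass over the shared convolutional feature map being amortized across 64 RoIs per image instead of computed once per RoI.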
ELM is a supervised learning method for single-hidden-layer feedforward networks (SLFNs). ELM attains high accuracy and fast prediction speed on numerous real-life problems [2], [6]. Instead of fully tuning all internal parameters as gradient-based algorithms do, ELM randomly selects the input weights and hidden-layer biases, and then determines the output weights analytically [2].
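The procedure just described (random hidden layer, analytic output weights via a least-squares solve) fits in a few lines of NumPy. The toy regression task, hidden-layer size, and tanh activation are assumptions for illustration:

```python
import numpy as np

def elm_train(X, Y, n_hidden=50, seed=0):
    """Extreme Learning Machine: input weights and biases are random and
    never tuned; output weights come from a pseudoinverse solve."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights
    b = rng.normal(size=n_hidden)                 # random hidden biases
    H = np.tanh(X @ W + b)                        # hidden-layer activations
    beta = np.linalg.pinv(H) @ Y                  # analytic output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression: fit y = sin(x) on [0, pi].
X = np.linspace(0, np.pi, 100).reshape(-1, 1)
Y = np.sin(X)
W, b, beta = elm_train(X, Y)
err = np.abs(elm_predict(X, W, b, beta) - Y).max()
print(err)  # small training error with no gradient descent at all
```

The single pseudoinverse computation replaces the entire iterative training loop, which is where ELM's speed advantage over backpropagation comes from.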
a text input or an image to generate a corresponding narration. In a world where internet users throw
As the dimensionality of the attribute space increases, many types of data analysis and classification become significantly harder. In addition, the data becomes increasingly sparse in the space it occupies, which can cause serious difficulties for both supervised and unsupervised learning.
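This sparsity effect can be illustrated numerically: as dimensionality grows, pairwise distances between random points concentrate around a common value, eroding the contrast that nearest-neighbour methods rely on. The point count and dimensions below are arbitrary choices:

```python
import numpy as np

def distance_contrast(dim, n=200, seed=0):
    """(max - min) / min over all pairwise Euclidean distances among
    n uniformly random points in the unit cube [0, 1]^dim."""
    rng = np.random.default_rng(seed)
    pts = rng.uniform(size=(n, dim))
    sq = (pts ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * pts @ pts.T   # squared distances
    d = np.sqrt(np.maximum(d2[np.triu_indices(n, k=1)], 0.0))
    return (d.max() - d.min()) / d.min()

low_dim, high_dim = distance_contrast(2), distance_contrast(1000)
print(low_dim > high_dim)  # distances concentrate as dimensionality grows
```

In two dimensions some point pairs are far closer than others, so the contrast ratio is large; in 1000 dimensions nearly all pairs are about equally far apart, so "nearest" neighbours are barely nearer than the rest.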
Classifying objects is an easy task for humans, but it has proved to be a complex problem for machines. The rise of high-capacity computers, the availability of high-quality, low-priced video cameras, and the increasing need for automatic video analysis have generated interest in object classification algorithms. A simple classification system consists of a camera fixed high above the zone of interest, where images are captured and subsequently processed. Classification involves image sensing, image preprocessing, object detection, object segmentation, feature extraction, and object classification. The classification system includes a database of predefined patterns that are compared with the detected object to assign it to the proper category. Image classification is an important and challenging task in various application domains, including biomedical imaging, biometrics, video surveillance, vehicle navigation, industrial visual inspection, robot navigation, and remote sensing.
The solutions provided in this research work are well suited to retrieving images from natural and photographic image databases. They use a combination of image processing and machine learning algorithms to perform retrieval quickly while improving both precision (the fraction of retrieved images that are relevant to the query) and recall (the fraction of relevant images that are successfully retrieved).
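The two retrieval metrics just described are standard precision and recall, computed from set overlap. A minimal sketch with invented image ids:

```python
def precision_recall(retrieved, relevant):
    """Precision: fraction of retrieved images that are relevant.
    Recall: fraction of relevant images that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    return len(hits) / len(retrieved), len(hits) / len(relevant)

# Toy query: the system returns 5 image ids; 8 images are truly relevant.
retrieved = [3, 7, 12, 25, 31]
relevant = [3, 7, 9, 12, 18, 22, 31, 40]
p, r = precision_recall(retrieved, relevant)
print(p, r)  # 0.8 precision (4/5 retrieved are relevant), 0.5 recall (4/8 found)
```

Improving both metrics simultaneously is the hard part: returning more images tends to raise recall while lowering precision, and vice versa.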
With the right algorithm, an image sensor can be made to sense or detect practically anything. Image sensors are among the most important sensors used in the robotics industry because they are so flexible, but they have two drawbacks: 1) they output a lot of data, dozens of megabytes per second, and 2) processing this amount of data can overwhelm many processors. Even if the processor can keep up with the data, much of its processing power will not be available for other tasks.
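The "dozens of megabytes per second" figure is easy to verify with back-of-the-envelope arithmetic; the parameters below assume a modest VGA color sensor (an illustrative choice, not a specific device):

```python
def sensor_data_rate(width, height, bytes_per_pixel, fps):
    """Raw, uncompressed image-sensor data rate in megabytes per second."""
    return width * height * bytes_per_pixel * fps / 1e6

# 640x480 pixels, 3 bytes/pixel (RGB), 30 frames per second.
rate = sensor_data_rate(640, 480, 3, 30)
print(round(rate, 1))  # 27.6
```

Even this low-resolution example approaches 30 MB/s of raw data; an HD sensor at the same frame rate produces several times more, which is why vision pipelines lean so heavily on hardware acceleration and early data reduction.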