JeVoisBase  1.21
JeVois Smart Embedded Machine Vision Toolkit Base Modules
DarknetSingle.C
// ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////
//
// JeVois Smart Embedded Machine Vision Toolkit - Copyright (C) 2016 by Laurent Itti, the University of Southern
// California (USC), and iLab at USC. See http://iLab.usc.edu and http://jevois.org for information about this project.
//
// This file is part of the JeVois Smart Embedded Machine Vision Toolkit. This program is free software; you can
// redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software
// Foundation, version 2. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
// without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
// License for more details. You should have received a copy of the GNU General Public License along with this program;
// if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
//
// Contact information: Laurent Itti - 3641 Watt Way, HNB-07A - Los Angeles, CA 90089-2520 - USA.
// Tel: +1 213 740 3527 - itti@pollux.usc.edu - http://iLab.usc.edu - http://jevois.org
// ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////
/*! \file */

#include <jevois/Core/Module.H>
#include <jevois/Debug/Timer.H>
#include <jevois/Image/RawImageOps.H>
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <jevoisbase/Components/ObjectRecognition/Darknet.H>

// icon from https://pjreddie.com/darknet/

//! Identify objects using Darknet deep neural network
/*! Darknet is a popular neural network framework. This module identifies the object in a square region in the center
    of the camera field of view using a deep convolutional neural network.

    The deep network analyzes the image by filtering it using many different filter kernels, and several stacked passes
    (network layers). This essentially amounts to detecting the presence of both simple and complex parts of known
    objects in the image (e.g., from detecting edges in lower layers of the network to detecting car wheels or even
    whole cars in higher layers). The last layer of the network is reduced to a vector with one entry per known kind of
    object (object class). This module returns the class names of the top scoring candidates in the output vector, if
    any have scored above a minimum confidence threshold. When nothing is recognized with sufficiently high confidence,
    there is no output.
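
As a rough illustration of the selection step described above (not the actual code of the Darknet component), the following sketch keeps the best-scoring classes that pass a confidence threshold. The names `Candidate` and `topCandidates` are assumptions made for this example only; the module itself stores its results as jevois::ObjReco.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for this sketch (the module itself uses jevois::ObjReco).
struct Candidate { std::string category; float score; };

// Return up to 'top' candidates whose score is >= 'thresh', sorted by decreasing score.
// 'scores' is the network output vector, one entry per class; 'names' holds the class names.
std::vector<Candidate> topCandidates(std::vector<float> const & scores,
                                     std::vector<std::string> const & names,
                                     std::size_t top, float thresh)
{
  std::vector<Candidate> out;
  std::size_t const n = std::min(scores.size(), names.size());

  // Keep only the classes that pass the confidence threshold:
  for (std::size_t i = 0; i < n; ++i)
    if (scores[i] >= thresh) out.push_back(Candidate{ names[i], scores[i] });

  // Best score first, then truncate to the requested number of results:
  std::sort(out.begin(), out.end(),
            [](Candidate const & a, Candidate const & b) { return a.score > b.score; });
  if (out.size() > top) out.resize(top);

  return out; // empty when nothing passes the threshold, i.e., no output
}
```

An empty return value corresponds to the "no output" case: nothing scored high enough to be reported.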

    Darknet is a great alternative to popular neural network frameworks like Caffe, TensorFlow, MXNet, PyTorch, Theano,
    etc., as it features: 1) small footprint which is great for small embedded systems; 2) hardware acceleration using
    ARM NEON instructions; 3) support for large GPUs when compiled on expensive servers, which is useful to train the
    neural networks on big servers, then copy the trained weights directly to JeVois for use with live video.

    See https://pjreddie.com/darknet for more details about darknet.

    \youtube{d5CfljT5kec}

    This module runs a Darknet network and shows the top-scoring results. The network is currently a bit slow, hence it
    is only run once in a while. Point your camera towards some interesting object, make the object fit in the picture
    shown at right (which will be fed to the neural network), keep it stable, and wait for Darknet to tell you what it
    found. The framerate figures shown at the bottom left of the display reflect the speed at which each new video frame
    from the camera is processed, but in this module this just amounts to converting the image to RGB, sending it to the
    neural network for processing in a separate thread, and creating the demo display. Actual network inference speed
    (time taken to compute the predictions on one image) is shown at the bottom right. See below for how to trade off
    speed against accuracy.

    Note that by default this module runs the Imagenet1k tiny Darknet (it can also run the slightly slower but a bit
    more accurate Darknet Reference network; see parameters). There are 1000 different kinds of objects (object classes)
    that these networks can recognize (too long to list here). The input layer of these two networks is 224x224 pixels
    by default. This module takes a crop at the center of the video image, with size determined by the network input
    size. With the default network parameters, this module hence requires at least 320x240 camera sensor resolution. The
    networks provided on the JeVois microSD image have been trained on large clusters of GPUs, typically using 1.2
    million training images from the ImageNet dataset.

    Sometimes this module will make mistakes! The performance of darknet-tiny is about 58.7% correct (mean average
    precision) on the test set, and Darknet Reference is about 61.1% correct on the test set, using the default 224x224
    network input layer size.

    Neural network size and speed
    -----------------------------

    When using a video mapping with USB output, the network is automatically resized to a square size that is the
    difference between the USB output video width and the camera sensor input width (e.g., when USB video mode is
    544x240 and camera sensor mode is 320x240, the network will be resized to 224x224 since 224=544-320).

    The network size directly affects both speed and accuracy. Larger networks run slower but are more accurate.

    For example:

    - with USB output 544x240 (network size 224x224), this module runs at about 450ms/prediction.
    - with USB output 448x240 (network size 128x128), this module runs at about 180ms/prediction.

    When using a video mapping with no USB output, the network is not resized (since we would not know what to resize
    it to). You can still change its native size by changing the network's config file, for example, change the width
    and height fields in <b>JEVOIS:/share/darknet/single/cfg/tiny.cfg</b>.

    Note that network dims must always be such that they fit inside the camera input image.
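
The sizing rule above, together with the central crop taken by the processing functions of this module, can be sketched as follows. `CropWindow` and `centralCrop` are hypothetical names introduced for this illustration only. The rounding of offsets down to even values mirrors the `& (~1)` used in the module code; for the horizontal offset this keeps the crop aligned on YUYV pixel pairs.

```cpp
#include <stdexcept>

// Hypothetical helper type for this sketch: a square crop window inside the camera frame.
struct CropWindow { int x, y, w, h; };

// Compute the network input window from the USB output width and the camera frame size.
// The network side is (USB width - camera width); the window is centered in the camera frame.
CropWindow centralCrop(int usbw, int camw, int camh)
{
  if (usbw < camw) throw std::invalid_argument("USB output image must be larger than camera input");

  int const net = usbw - camw; // square network input, e.g., 544 - 320 = 224
  if (net > camw || net > camh) throw std::invalid_argument("Network input window must fit within camera frame");

  // Center the window, rounding offsets down to even values:
  int const offx = ((camw - net) / 2) & (~1);
  int const offy = ((camh - net) / 2) & (~1);

  return CropWindow{ offx, offy, net, net };
}
```

With the two video mappings listed above and a 320x240 camera frame, `centralCrop(544, 320, 240)` gives a 224x224 window at offset (48, 8), and `centralCrop(448, 320, 240)` gives a 128x128 window at offset (96, 56).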

    Serial messages
    ---------------

    When detections are found with confidence scores above \p thresh, a message containing up to \p top category:score
    pairs will be sent per video frame. Exact message format depends on the current \p serstyle setting and is described
    in \ref UserSerialStyle. For example, when \p serstyle is \b Detail, this module sends:

    \verbatim
    DO category:score category:score ... category:score
    \endverbatim

    where \a category is a category name (from \p namefile) and \a score is the confidence score from 0.0 to 100.0 that
    this category was recognized. The pairs are in order of decreasing score.
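
On the receiving side, a \b Detail message of this form could be parsed roughly as shown below. This is only a sketch: `Reco` and `parseDetailMessage` are hypothetical names, the code assumes that category names contain no whitespace, and it splits each pair at its last colon. \ref UserSerialStyle remains the authoritative description of the format.

```cpp
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical host-side type for this sketch.
struct Reco { std::string category; float score; };

// Parse "DO category:score category:score ..." into a list of pairs.
// Returns an empty vector if the line does not start with the "DO" token.
// Assumes categories contain no whitespace; malformed pairs are skipped.
std::vector<Reco> parseDetailMessage(std::string const & line)
{
  std::vector<Reco> out;
  std::istringstream is(line);
  std::string tok;
  if (!(is >> tok) || tok != "DO") return out;

  while (is >> tok)
  {
    // Split at the last colon so that the score is always the trailing field:
    std::size_t const pos = tok.rfind(':');
    if (pos == std::string::npos || pos == 0 || pos + 1 >= tok.size()) continue;

    try { out.push_back(Reco{ tok.substr(0, pos), std::stof(tok.substr(pos + 1)) }); }
    catch (std::exception const &) { } // score was not a number: skip this pair
  }
  return out;
}
```

Since the pairs arrive in order of decreasing score, the first element of the returned vector, if any, is the best candidate.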

    See \ref UserSerialStyle for more on standardized serial messages, and \ref coordhelpers for more info on
    standardized coordinates.

    @author Laurent Itti

    @displayname Darknet Single
    @videomapping NONE 0 0 0.0 YUYV 320 240 2.1 JeVois DarknetSingle
    @videomapping YUYV 544 240 15.0 YUYV 320 240 15.0 JeVois DarknetSingle
    @videomapping YUYV 448 240 15.0 YUYV 320 240 15.0 JeVois DarknetSingle
    @email itti\@usc.edu
    @address University of Southern California, HNB-07A, 3641 Watt Way, Los Angeles, CA 90089-2520, USA
    @copyright Copyright (C) 2017 by Laurent Itti, iLab and the University of Southern California
    @mainurl http://jevois.org
    @supporturl http://jevois.org/doc
    @otherurl http://iLab.usc.edu
    @license GPL v3
    @distribution Unrestricted
    @restrictions None
    \ingroup modules */
class DarknetSingle : public jevois::StdModule
{
  public:
    // ####################################################################################################
    //! Constructor
    // ####################################################################################################
    DarknetSingle(std::string const & instance) : jevois::StdModule(instance)
    {
      itsDarknet = addSubComponent<Darknet>("darknet");
    }

    // ####################################################################################################
    //! Virtual destructor for safe inheritance
    // ####################################################################################################
    virtual ~DarknetSingle()
    { }

    // ####################################################################################################
    //! Un-initialization
    // ####################################################################################################
    virtual void postUninit() override
    {
      // Only call get() on a valid future; calling it on an invalid one is undefined behavior:
      if (itsPredictFut.valid()) try { itsPredictFut.get(); } catch (...) { }
    }

    // ####################################################################################################
    //! Processing function, no video output
    // ####################################################################################################
    virtual void process(jevois::InputFrame && inframe) override
    {
      // Wait for next available camera image:
      jevois::RawImage const inimg = inframe.get();
      int const w = inimg.width, h = inimg.height;

      // Check input vs network dims, will throw if network not ready:
      int netw, neth, netc;
      try { itsDarknet->getInDims(netw, neth, netc); }
      catch (std::logic_error const & e) { inframe.done(); return; }

      if (netw > w) netw = w;
      if (neth > h) neth = h;

      // Take a central crop of the input:
      int const offx = ((w - netw) / 2) & (~1);
      int const offy = ((h - neth) / 2) & (~1);

      cv::Mat cvimg = jevois::rawimage::cvImage(inimg);
      cv::Mat crop = cvimg(cv::Rect(offx, offy, netw, neth));

      // Convert crop to RGB for predictions:
      cv::cvtColor(crop, itsCvImg, cv::COLOR_YUV2RGB_YUYV);

      // Let camera know we are done processing the input image:
      inframe.done();

      // Launch the predictions (do not catch exceptions, we already tested for network ready in this block):
      float const ptime = itsDarknet->predict(itsCvImg, itsResults);
      LINFO("Predicted in " << ptime << "ms");

      // Send serial results:
      sendSerialObjReco(itsResults);
    }

    // ####################################################################################################
    //! Processing function with video output to USB
    // ####################################################################################################
    virtual void process(jevois::InputFrame && inframe, jevois::OutputFrame && outframe) override
    {
      static jevois::Timer timer("processing", 30, LOG_DEBUG);

      // Wait for next available camera image:
      jevois::RawImage const inimg = inframe.get();

      timer.start();

      // We only handle one specific pixel format, but any image size in this module:
      int const w = inimg.width, h = inimg.height;
      inimg.require("input", w, h, V4L2_PIX_FMT_YUYV);

      // While we process it, start a thread to wait for out frame and paste the input into it:
      jevois::RawImage outimg;
      auto paste_fut = jevois::async([&]() {
          outimg = outframe.get();
          outimg.require("output", outimg.width, outimg.height, V4L2_PIX_FMT_YUYV);

          // Paste the current input image:
          jevois::rawimage::paste(inimg, outimg, 0, 0);
          jevois::rawimage::writeText(outimg, "JeVois Darknet Single - input", 3, 3, jevois::yuyv::White);

          // Paste the latest prediction results, if any, otherwise a wait message:
          cv::Mat outimgcv = jevois::rawimage::cvImage(outimg);
          if (itsRawPrevOutputCv.empty() == false)
            itsRawPrevOutputCv.copyTo(outimgcv(cv::Rect(w, 0, itsRawPrevOutputCv.cols, itsRawPrevOutputCv.rows)));
          else
          {
            jevois::rawimage::writeText(outimg, "Loading network -", w + 3, 3, jevois::yuyv::White);
            jevois::rawimage::writeText(outimg, "please wait...", w + 3, 15, jevois::yuyv::White);
          }
        });

      // Decide on what to do based on itsPredictFut: if it is valid, we are still predicting, so check whether we are
      // done and if so draw the results. Otherwise, start predicting using the current input frame:
      if (itsPredictFut.valid())
      {
        // Are we finished predicting?
        if (itsPredictFut.wait_for(std::chrono::milliseconds(5)) == std::future_status::ready)
        {
          // Do a get() on our future to free up the async thread and get any exception it might have thrown. In
          // particular, it will throw a logic_error if we are still loading the network:
          bool success = true; float ptime = 0.0F;
          try { ptime = itsPredictFut.get(); } catch (std::logic_error const & e) { success = false; }

          // Wait for paste to finish up and let camera know we are done processing the input image:
          paste_fut.get(); inframe.done();

          if (success)
          {
            int const netw = itsRawInputCv.cols, neth = itsRawInputCv.rows;
            cv::Mat outimgcv = jevois::rawimage::cvImage(outimg);

            // Update our output image: First paste the image we have been making predictions on:
            itsRawInputCv.copyTo(outimgcv(cv::Rect(w, 0, netw, neth)));
            jevois::rawimage::drawFilledRect(outimg, w, neth, netw, h - neth, jevois::yuyv::Black);

            // Then draw the detections: either below the detection crop if there is room, or on top of it if not
            // enough room below:
            int y = neth + 3; if (y + int(itsDarknet->top::get()) * 12 > h - 21) y = 3;

            for (auto const & p : itsResults)
            {
              jevois::rawimage::writeText(outimg, jevois::sformat("%s: %.2F", p.category.c_str(), p.score),
                                          w + 3, y, jevois::yuyv::White);
              y += 12;
            }

            // Send serial results:
            sendSerialObjReco(itsResults);

            // Draw some text messages:
            jevois::rawimage::writeText(outimg, "Predict time: " + std::to_string(int(ptime)) + "ms",
                                        w + 3, h - 11, jevois::yuyv::White);

            // Finally make a copy of these new results so we can display them again while we wait for the next round:
            itsRawPrevOutputCv = cv::Mat(h, netw, CV_8UC2);
            outimgcv(cv::Rect(w, 0, netw, h)).copyTo(itsRawPrevOutputCv);

          } else { itsRawPrevOutputCv.release(); } // network is not ready yet
        }
        else
        {
          // Future is not ready, do nothing except drawings on this frame (done in paste_fut thread) and we will try
          // again on the next one...
          paste_fut.get(); inframe.done();
        }
      }
      else // We are not predicting: start new predictions
      {
        // Wait for paste to finish up:
        paste_fut.get();

        // In this module, we use square crops for the network, with size given by USB width - camera width:
        if (outimg.width < inimg.width) LFATAL("USB output image must be larger than camera input");
        int const netw = outimg.width - inimg.width;
        int const neth = netw; // square crop

        // Check input vs network dims:
        if (netw > w || neth > h) LFATAL("Network input window must fit within camera frame");

        // Take a central crop of the input:
        int const offx = ((w - netw) / 2) & (~1);
        int const offy = ((h - neth) / 2) & (~1);
        cv::Mat cvimg = jevois::rawimage::cvImage(inimg);
        cv::Mat crop = cvimg(cv::Rect(offx, offy, netw, neth));

        // Convert crop to RGB for predictions:
        cv::cvtColor(crop, itsCvImg, cv::COLOR_YUV2RGB_YUYV);

        // Also make a raw YUYV copy of the crop for later displays:
        crop.copyTo(itsRawInputCv);

        // Let camera know we are done processing the input image:
        inframe.done();

        // Launch the predictions; will throw if network is not ready:
        try
        {
          int netinw, netinh, netinc; itsDarknet->getInDims(netinw, netinh, netinc); // will throw if not ready
          itsPredictFut = jevois::async([&]() { return itsDarknet->predict(itsCvImg, itsResults); });
        }
        catch (std::logic_error const & e) { itsRawPrevOutputCv.release(); } // network is not ready yet
      }

      // Show processing fps:
      std::string const & fpscpu = timer.stop();
      jevois::rawimage::writeText(outimg, fpscpu, 3, h - 13, jevois::yuyv::White);

      // Send the output image with our processing results to the host over USB:
      outframe.send();
    }

    // ####################################################################################################
  protected:
    std::shared_ptr<Darknet> itsDarknet;
    std::vector<jevois::ObjReco> itsResults;
    std::future<float> itsPredictFut;
    cv::Mat itsRawInputCv;
    cv::Mat itsCvImg;
    cv::Mat itsRawPrevOutputCv;
};

// Allow the module to be loaded as a shared object (.so) file:
JEVOIS_REGISTER_MODULE(DarknetSingle);