In today’s world, mobile communication is everything. We are surrounded by apps for audio and video calls, meetings, and broadcasts. With the pandemic, it’s not just business meetings that have moved from meeting rooms to calling apps. Calls to family, concerts, and even consultations with doctors are all now available on apps.
In this article we’ll cover the features every communication app should have, whether it’s a small program for calls or a platform for business meetings and webinars, and in the following articles, we’ll show you some examples of how to implement them.
Incoming call notification
Apps use notifications to tell you about something important, and nothing is more important for a communication app than an incoming call or a scheduled conference that the user forgot about.
So any app with call functionality has to use this mechanism. The notification can show the caller’s name and photo, and for the user’s convenience it can include buttons to answer or reject the call without opening the app.
You can go even further and change the notification design provided by the system.
However, options for Android devices don’t end here. Show a full-screen notification with your design even if the screen is locked! Read the guide on how to make your Android call notification here.
A notification that keeps the process from being closed
A call may take a long time, so the user may decide to do something else at the same time and open another application, for example, a text document. At this moment an unpleasant surprise awaits: if the system does not have enough resources for that application, it may simply close ours without warning! The call is terminated, leaving the user very confused.
Fortunately, there is a way to avoid this by using the Foreground Service mechanism. We mark our application as being actively used by the user even if it is minimized. After that, the application might get closed only in the most extreme case, if the system runs out of resources even for the most crucial processes.
The system, for security reasons, requires a persistent small notification, letting the user know that the application is performing work in the background.
It is essentially a normal notification with one difference: it can’t be swiped away. So you don’t need to worry about the user accidentally dismissing it and leaving the application defenseless against the all-optimizing system again.
You can get by with a very small notification. Unlike an incoming call notification, it appears quietly in the notification panel without popping up in front of the user.
Nevertheless, it is still a notification, and all the techniques described in the previous section apply to it: you can add buttons and customize the design.
Picture-in-picture for video calls
Now the user can take part in a call or conference and do other things without being afraid that the call will end abruptly. However, we can go even further in supporting multitasking!
If your app has a video call feature, you can show a small video call window (picture-in-picture) for the user’s convenience, even if they go to other app screens. And, starting from Android 8.0, we can show such a window not only in our application but also on top of other applications!
You can also add controls to this window, such as camera switching or pause buttons.
Ability to switch audio output devices
An integral part of any application with calls, video conferences, or broadcasts is audio playback. But how do we know which audio output device the user wants to hear the sound from? We can, of course, try to guess for them, but it’s always better to guess and also provide a choice. For example, with this feature, the user won’t have to turn off their Bluetooth headphones to switch to the speakerphone.
So if you give the user the ability to switch the audio output device at any point in the call, they will be grateful.
The implementation often depends on the specific application, but there is a method that works in almost all cases. You will learn about it in one of the next articles in this series.
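The “guess, but let the user choose” idea can be sketched in a few lines. This is only the selection logic, not Android code: the device names and the priority order are assumptions made for illustration, and a real app would get the list of available devices from the platform.

```python
from typing import Optional

# Assumed default order: prefer whatever the user has plugged in or paired.
DEFAULT_PRIORITY = ["bluetooth", "wired_headset", "speaker", "earpiece"]


def pick_output(available: list[str], user_choice: Optional[str] = None) -> str:
    """Return the audio output to use for the call."""
    # An explicit choice always wins, as long as the device is still there.
    if user_choice is not None and user_choice in available:
        return user_choice
    # Otherwise fall back to our best guess.
    for device in DEFAULT_PRIORITY:
        if device in available:
            return device
    raise ValueError("no audio output available")
```

With Bluetooth headphones connected, `pick_output(["speaker", "bluetooth"])` picks the headphones, while passing `user_choice="speaker"` switches to the speakerphone without disconnecting anything.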
A deep link to quickly join a conference or a call
For both app distribution and UX, the ability to share a broadcast or invite someone to a call or conference is useful. But it may happen that the person invited is not yet a user of your app.
Well, that won’t be for long. You can generate a special link that will take those who already have the app directly to the call to which they were invited and those who don’t have the app installed to their platform’s app store. iPhone owners will go to the App Store, and Android users will go to Google Play.
In addition, with this link, the application will launch as soon as it is installed, and the new user will immediately join the call they were invited to!
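The routing behind such a link can be sketched as follows. Everything named here is hypothetical: the store URLs, the `examplecalls://` scheme, and the `referrer` parameter only stand in for whatever your link provider and platforms actually use to carry the call through installation.

```python
from dataclasses import dataclass

# Hypothetical store pages and URI scheme, for illustration only.
APP_STORE_URL = "https://apps.apple.com/app/example-calls"
PLAY_STORE_URL = "https://play.google.com/store/apps/details?id=com.example.calls"


@dataclass
class Visitor:
    platform: str        # "ios" or "android"
    app_installed: bool


def resolve_invite(call_id: str, visitor: Visitor) -> str:
    """Return where a person who taps the invite link should land."""
    if visitor.app_installed:
        # Existing users go straight into the call.
        return f"examplecalls://call/{call_id}"
    # New users go to their platform's store. The call id travels along
    # so the app can open the call on first launch.
    store = APP_STORE_URL if visitor.platform == "ios" else PLAY_STORE_URL
    separator = "&" if "?" in store else "?"
    return f"{store}{separator}referrer=call_{call_id}"
```

In practice this decision is usually made by a deep-linking service rather than by your own code, but the branches stay the same.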
We covered the main system features that allow us to improve the user experience in audio/video apps, from protecting the app from being shut down by the system in the middle of a call to UX conveniences like picture-in-picture mode.
Of course, every app is unique, with its own tasks and nuances, so these tips are not hard-and-fast rules. Nevertheless, if something from this list seems appropriate for a particular application, it’s worth implementing.
2,000 IP cameras stream through our video surveillance system, ipivs.com. It is used in 450 US police departments, medical education facilities, and child advocacy centers.
What features we develop for closed-circuit television (CCTV) software
📹Live video streaming from internet protocol (IP) cameras
Full HD quality – like most movies on YouTube and television. Watch it on a cinema size screen in an auditorium – students will see the details.
Video and audio are so well in sync that speech pathologists work with these streams.
🔎Pan-tilt-zoom – PTZ
Pan means the camera moves left and right to show the whole room.
Tilt means movement up and down.
Zoom – enlarge the view. E.g., zoom from a view of the whole room to a sheet of paper on the table to make out what a patient is writing.
To use PTZ, you need to buy PTZ-enabled IP cameras.
🎬Video recording from digital IP cameras
Hit the record button when you want – or schedule recording. The stream will start recording automatically.
Set recurrence – e.g. daily, weekly. Decide for how long to repeat: end after N repetitions, endlessly, till a certain date. Set a starting position to record for PTZ cameras.
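The recurrence options above map onto a small generator. This is a minimal sketch, not our production scheduler: the function name and parameters are made up for the example, and only daily and weekly steps are covered.

```python
from datetime import datetime, timedelta
from typing import Iterator, Optional

STEP = {"daily": timedelta(days=1), "weekly": timedelta(weeks=1)}


def recording_starts(first: datetime, every: str,
                     count: Optional[int] = None,
                     until: Optional[datetime] = None) -> Iterator[datetime]:
    """Yield the start time of each scheduled recording.

    Stops after `count` repetitions or once `until` has passed.
    With neither limit set, the schedule repeats endlessly.
    """
    current, produced = first, 0
    while (count is None or produced < count) and (until is None or current <= until):
        yield current
        current += STEP[every]
        produced += 1


# Every Monday at 9:00, ending after three repetitions.
mondays = list(recording_starts(datetime(2020, 1, 6, 9, 0), "weekly", count=3))
```

Because the endless variant never stops on its own, a caller would normally take only the next few values from it, for example with `itertools.islice`.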
Save to any popular format, e.g. mp4. Convert.
No hardware equipment on site, except the IP cameras. We program all the functions that Digital Video Recorders (DVR) and Network Video Recorders (NVR) have. So video is processed and stored on a server. The servers are usually in the cloud – rented from providers like Amazon. However, the server can be in your local server room too.
Some IP cameras have speakers. So you can speak into your laptop mic and someone will hear you near the IP camera. Scare intruders away 🙂
📼Marks on video
Add comments on video while watching live or pre-recorded ones. Police officers mark confessions – and don’t have to re-watch the whole interrogation to find it again.
📸Closed-circuit digital photography (CCDP)
Take pictures with your IP cameras and save them.
Video of people is sensitive content and subject to law regulations. For example, doctors must access interviews of their patients only and not their colleagues’. We develop software with as many user roles as you need. When a user logs in, he only has access to the content permitted to him.
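A role check of this kind can be sketched like this. The two roles and the `owner` field are assumptions for the example; a real system would have as many roles and rules as the customer needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    title: str
    owner: str  # the doctor who conducted the interview


@dataclass(frozen=True)
class User:
    name: str
    role: str   # "doctor" or "admin" in this sketch


def visible_recordings(user: User, recordings: list[Recording]) -> list[Recording]:
    """Return only the recordings this user is permitted to open."""
    if user.role == "admin":
        return list(recordings)
    if user.role == "doctor":
        # Doctors see interviews of their own patients only.
        return [r for r in recordings if r.owner == user.name]
    return []  # unknown roles get nothing by default
```

Denying access by default for unknown roles is a deliberate choice here: with sensitive video it is safer to forget to grant a permission than to forget to remove one.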
🕹Operate with hardware buttons
Push a real button on the wall to start streaming and recording. An “In Use” sign lights up. Stop the recording from your computer or with the same button.
👋Movement and object recognition
Some IP cameras have movement detectors. Cameras may start recording or play some warning sound to scare away the intruders. Get an SMS or push notification about that.
Define “suspicious objects”, and the system will warn you when a camera spots one. We developed an app in which military drones monitor land for enemy soldiers and vehicles this way. Neural networks help the app recognize them better and better. We recognize objects in live video with OpenCV.
Type a word and get every spot in the video where it is spoken. Police officers search through interrogation recordings this way. For speech recognition, we use Amazon Transcribe, an Amazon Web Services (AWS) product.
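Once a recording has been transcribed with word timings, the search itself is simple. The sketch below assumes the transcript has already been reduced to (word, start time in seconds) pairs; that shape is a simplification for the example, not the format the transcription service returns.

```python
def find_word(transcript: list[tuple[str, float]], query: str) -> list[float]:
    """Return the start time of every spot where `query` is spoken."""
    needle = query.casefold()
    return [start for word, start in transcript
            if word.casefold().strip(".,!?") == needle]


# A made-up fragment of an interview transcript.
transcript = [("I", 0.4), ("confess", 0.9), ("Nothing", 3.2), ("to", 3.6), ("confess.", 3.9)]
marks = find_word(transcript, "Confess")
```

The returned times can then be placed on the video timeline as marks.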
Crop videos and save short clips. Delete unneeded parts.
Burn video on CDs. Yep, the police still use them in 2020.
Devices for which we develop VMS and video surveillance software
What IP camera to pick for a video surveillance application
Start with those that support ONVIF standards and program your software to support them.
Most IP cameras support the ONVIF standard. It defines a standard API (application programming interface), a “language” in which a program can talk to IP cameras.
Axis, Bosch, and Sony founded ONVIF. Most of their cameras should support it, but this is not guaranteed. Other manufacturers want their cameras to sell well, so they have an interest in supporting ONVIF too. However, the standard is voluntary, so not every camera supports it.
If a camera does not support ONVIF, support for it has to be programmed separately. So you can’t write the code once and have it work with any camera.
So, a safe bet among IP camera brands is starting with Axis. Axis has the largest market share – your software will support more cameras than it would with any other choice. Many Bosch and Sony cameras will work too as they support ONVIF.
What industries we developed video surveillance and management software for
🔬Clinical observation and recording
🧨Military drone observation
🎰 Poker: recognition of chips in real casinos, recognition of cards in online casinos
How much it costs to develop video surveillance software
The initial working version of a video surveillance website takes us about 3 months and costs around USD 24,800. It lets you add IP cameras, watch live streams, and record.
However, custom software needs individual planning and estimation.
With ipivs.com we work on an ongoing basis – provide a dedicated team.
Send us a message through Request a quote. We’ll estimate the time and price for your project.
Over the last 10 years, the term “neural networks” has spread beyond the scientific and professional community. The theory of neural networks emerged in the middle of the last century, but only around 2012 did computing power become sufficient to train them. That is when their widespread use began.
Neural networks are increasingly being used in mobile application development. The Deloitte report indicates that more than 60% of the applications installed by adults in developed countries use neural networks. According to statistics, Android has been ahead of its competitors in popularity for several years.
Neural networks are used:
to recognize and process voices (modern voice assistants),
to recognize and process objects (computer vision),
to recognize and process natural languages (natural language processing),
to find malicious programs,
to automate apps and make them more efficient. For example, there are healthcare applications that detect diabetic retinopathy by analyzing retinal scans.
What are neural networks and how do they work?
Mankind borrowed the idea of neural networks from nature: scientists took the animal and human nervous systems as a model. A natural neuron consists of a cell body with a nucleus, dendrites, and an axon. The axon splits into several branches that form synapses (connections) with the dendrites of other neurons.
The artificial neuron has a similar structure. It consists of a nucleus (processing unit), several dendrites (similar to inputs), and one axon (similar to outputs), as shown in the following picture:
Connections of several neurons form layers, and connections of layers form a neural network. There are three main types of neurons: input (receives information), hidden (processes information), and output (presents results of calculations). Take a look at the picture.
Neurons in different layers are connected through synapses. As a signal passes through a synapse, it can be either strengthened or weakened. Each synapse has one parameter, a weight: a coefficient, which can be any real number, that scales the signal passing through. Numbers (signals) arrive at the inputs, each is multiplied by its own weight, and the products are summed. The activation function then calculates the output signal from this sum and sends it to the output (see the picture).
Imagine the situation: you have touched an iron. Depending on the signal that comes from your finger through the nerve endings to the brain, it makes a decision: to pass the signal on through the neural connections and pull your finger away if the iron is hot, or not to pass the signal if the iron is cold and you can leave the finger on it. The mathematical analog, the activation function, has the same purpose. It lets a signal pass from neuron to neuron or blocks it, depending on the information the signal carries. If the information is important, the function passes it on; if it is insignificant or unreliable, the function does not.
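Here is how a single artificial neuron can look in code. It is a minimal sketch: the sigmoid is just one common choice of activation function, and the `bias` term is an extra parameter that the description above leaves out (it defaults to zero).

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(signals: list[float], weights: list[float], bias: float = 0.0) -> float:
    """Multiply each signal by its own weight, sum, then apply the activation."""
    weighted_sum = sum(s * w for s, w in zip(signals, weights)) + bias
    return sigmoid(weighted_sum)


# Strong signals on heavily weighted inputs pass almost unchanged ...
hot_iron = neuron_output([1.0, 1.0], [5.0, 5.0])
# ... while a signal with a large negative weight is almost fully blocked.
cold_iron = neuron_output([1.0], [-10.0])
```

An output close to 1 means “pass the signal on”, and an output close to 0 means “hold it back”.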
How to prepare neural networks for usage?
Work with neural nets goes through several stages:
Preparation of a neural network, which includes the choice of architecture (how neurons are organized), topology (the structure of their location relative to each other and the outside world), the learning algorithm, etc.
Loading the input data into a neural network.
Training a neural network. This is a very important stage, without which the neural network is useless. This is where all the magic happens: along with the input data, the neural network receives information about the expected result. The result obtained in the output layer is compared with the expected one. If they do not match, the network determines which neurons affected the final value the most and adjusts the weights on the connections with these neurons (the so-called error backpropagation algorithm). This is a very simplified explanation. We suggest reading this article to dive deeper into neural network training. Training is a very resource-intensive process, so it is not done on smartphones. The training time depends on the task, the architecture, and the volume of input data.
Checking training adequacy. A network does not always learn exactly what its creator wanted it to learn. There was a case where the network was trained to recognize images of tanks from photos. But since all the tanks were on the same background, the neural network learned to recognize this type of background, not the tanks. The quality of neural network training must be tested on examples that were not involved in its training.
Using a neural network – developers integrate the trained model into the application.
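The training and checking stages above can be illustrated on the smallest possible network: one neuron with one input. With a single neuron, error backpropagation reduces to one gradient step per example. The task (telling whether a number is above 0.5), the learning rate, and the number of epochs are all assumptions picked for the example.

```python
import math


def predict(x: float, w: float, b: float) -> float:
    """A single sigmoid neuron with one input."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def train(samples, epochs: int = 2000, lr: float = 0.5):
    """Nudge the weight and bias whenever output and expected result differ."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, expected in samples:
            error = predict(x, w, b) - expected  # compare with the expected result
            w -= lr * error * x                  # gradient of the log loss
            b -= lr * error
    return w, b


def accuracy(samples, w: float, b: float) -> float:
    """Share of samples on which the neuron gives the expected answer."""
    hits = sum((predict(x, w, b) > 0.5) == (expected == 1.0) for x, expected in samples)
    return hits / len(samples)


# Training set: the expected result is 1.0 when x is above 0.5.
training = [(i / 10, 1.0 if i > 5 else 0.0) for i in range(11)]
# Held-out set: values the neuron never saw during training.
held_out = [(0.05, 0.0), (0.15, 0.0), (0.85, 1.0), (0.95, 1.0)]

w, b = train(training)
print(f"held-out accuracy: {accuracy(held_out, w, b):.2f}")
```

The important part is the last step: the quality is measured on `held_out`, not on `training`, which is exactly what would have exposed the tank recognizer that had learned the background.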
Limitations of neural networks on mobile devices
Most mid-range and low-end mobile devices available on the market have between 2 and 4 GB of RAM. And usually, 1/3 of this capacity is reserved by the operating system. The system can “kill” applications with neural networks as they run when the RAM limit approaches.
The size of the application
Complex deep neural networks often take up several gigabytes. When a neural network is integrated into mobile software, it is compressed to some extent, but often not enough to work comfortably. The main recommendation for developers is to minimize the size of the application as much as possible on any platform to improve the UX.
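To get a feel for the numbers, here is a rough sketch of how model size follows from the number of weights, and how storing them in fewer bits shrinks it. The text does not say which compression is meant; 8-bit quantization is assumed here as one common option, and the 25 million parameter count is a made-up figure.

```python
def model_size_mb(parameter_count: int, bytes_per_weight: int) -> float:
    """Approximate storage for the weights alone, in megabytes (10**6 bytes)."""
    return parameter_count * bytes_per_weight / 1_000_000


def quantize(weights: list[float]) -> tuple[list[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid a zero scale
    return [round(w / scale) for w in weights], scale


def dequantize(values: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the stored integers."""
    return [v * scale for v in values]


full = model_size_mb(25_000_000, 4)     # 32-bit floats
compact = model_size_mb(25_000_000, 1)  # 8-bit integers
```

The price of the smaller file is precision: each restored weight can differ from the original by up to half of `scale`, which is why accuracy has to be checked again after compression.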
Simple neural networks often return results almost instantly and are suitable for real-time applications. However, deep neural networks can take dozens of seconds to process a single set of input data. Modern mobile processors are not yet as powerful as server processors, so processing results on a mobile device can take several hours.
To develop a mobile app with neural networks, you first need to create and train a neural network on a server or PC, and then implement it in the mobile app using off-the-shelf frameworks.
Working with a single app on multiple devices
As an example, a facial recognition app is installed on the user’s phone and tablet. It won’t be able to transfer data to other devices, so neural network training will happen separately on each of them.
Overview of neural network development libraries for Android
TensorFlow is an open-source library from Google for creating and training deep neural networks. With this library, we store a neural network and use it in an application.
The library can train and run deep neural networks to classify handwritten numbers, recognize images, embed words, and process natural languages. It works on Ubuntu, macOS, Android, iOS, and Windows.
To make learning TensorFlow easier, the development team has produced additional tutorials and improved getting started guides. Some enthusiasts have created their own TensorFlow tutorials (including InfoWorld). You can read several books on TensorFlow or take online courses.
We mobile developers should take a look at TensorFlow Lite, a lightweight TensorFlow solution for mobile and embedded devices. It lets you run machine learning inference on the device (but not training) with low latency and a small binary size. TensorFlow Lite also supports hardware acceleration through the Android Neural Networks API. TensorFlow Lite models are compact enough to run on mobile devices and can be used offline.
TensorFlow Lite runs fairly small neural network models on Android and iOS devices, even when they are offline.
The basic idea behind TensorFlow Lite is to train a TensorFlow model and convert it to the TensorFlow Lite format. The converted file can then be used in a mobile app.
TensorFlow Lite converter – converts TensorFlow models into an efficient form for usage by the interpreter, and can make optimizations to improve performance and binary file size.
TensorFlow Lite is designed to simplify machine learning on mobile devices themselves instead of sending data back and forth from the server. For developers, machine learning on the device offers the following benefits:
response time: the request is not sent to the server, but is processed on the device
privacy: the data does not leave the device
Internet connection is not required
the device consumes less energy because it does not send requests to the server
Firebase ML Kit
TensorFlow Lite makes it easier to implement and use neural networks in applications. However, developing and training models still requires a lot of time and effort. To make life easier for developers, the Firebase ML Kit library was created.
The library uses already trained deep neural networks in applications with minimal code. Most of the models offered are available both locally and on Google Cloud. Developers can use models for computer vision (character recognition, barcode scanning, object detection). The library is quite popular. For example, it is used in:
Yandex.Money (a Russian electronic payment service) to recognize QR codes;
FitNow, a fitness application that recognizes text on food labels for calorie counting;
TurboTax, a tax preparation application that recognizes document barcodes.
ML Kit also has:
language detection of written text;
translation of texts on the device;
smart message response (generating a reply sentence based on the entire conversation).
In addition to methods out of the box, there is support for custom models.
What’s important is that you don’t need any services, APIs, or backend for this. Everything can be done directly on the device: no user traffic is consumed, and developers don’t need to handle errors for the case when there is no internet connection. Moreover, it works faster on the device. The downside is increased power consumption.
Developers don’t need to publish the app every time after updates, as ML Kit will dynamically update the model when it goes online.
The ML Kit team decided to invest in model compression. They are experimenting with a feature that lets you upload a full TensorFlow model along with training data and get a compressed TensorFlow Lite model in return. The team is looking for partners to try out the technology and give feedback. If you’re interested, sign up here.
Since this library is available through Firebase, you can also take advantage of other services on that platform. For example, Remote Config and A/B testing make it possible to experiment with multiple user models. If you already have a trained neural network loaded into your application, you can add another one without republishing it to switch between them or use two at once for the sake of experimentation – the user won’t notice.
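The idea behind such an experiment is that every user is consistently assigned to one of the models. Firebase does this assignment for you; the sketch below only illustrates the principle with a hash of the user id, and the function and model names are made up.

```python
import hashlib


def assign_model(user_id: str, models: list[str]) -> str:
    """Deterministically map a user to one of the candidate models."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % len(models)
    return models[bucket]


candidates = ["model_a", "model_b"]
chosen = assign_model("user-1234", candidates)
```

Because the bucket depends only on the user id, the same user gets the same model on every launch, which keeps the comparison between the two models clean.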
Problems of using neural networks in mobile development
Developing Android apps that use neural networks is still a challenge for mobile developers. Training neural networks can take weeks or months since the input information can consist of millions of elements. Such a serious workload is still out of reach for many smartphones.
Consider whether you can do without a neural network in a mobile app if:
there are no specialists in your company who are familiar with neural networks;
your task is non-trivial and ready-made solutions from Google do not cover it, so you would have to develop your own model, which takes a lot of time;
the customer needs a quick result – training neural networks can take a very long time;
the application will be used on devices with an old version of Android (below 9). Such devices do not have enough power.
Neural networks became popular a few years ago, and more and more companies are using this technology in their applications. Mobile devices impose their own limitations on neural network operation. If you decide to use them, the best choice would be a ready-made solution from Google (ML Kit) or the development and implementation of your own neural network with TensorFlow Lite.