Deep-AI provides a unique integrated, accelerated deep learning solution for both training and inference at the edge.

Supported neural networks include ResNet and ResNet-like models (any layer depth and width) for classification; YOLO, Tiny-YOLO, and SSD for object detection; and MLP (multi-layer perceptron) models for signal and data analysis.

Our solution runs on Xilinx Alveo PCIe cards, certified and available on a variety of standard servers from leading server vendors. The same hardware is used for both inference and retraining of the deep learning model, enabling an ongoing iterative process that keeps the model up to date with the new data that is continuously generated.
Deep-AI’s solution for integrated training and inference at the edge brings numerous benefits when implementing and deploying an AI application:
  • Same hardware for training and inference, for lower cost, power, and hardware footprint
  • Seamless, hands-free switching between training and inference
  • No need to send data back to the cloud or data center

Furthermore, in most systems training is done in 32-bit floating point, while there is a growing need to run inference in 8-bit fixed point. In these cases, one must manually run a challenging, time- and resource-consuming quantization process to convert the 32-bit training output into an 8-bit inference input, and this conversion often results in a loss of accuracy. Because our training output is inference-ready (including for 3rd-party inference systems):

  • The fixed-point 8-bit output feeds directly into inference
  • No processing is needed to quantize the training output before inference
  • No accuracy is lost moving from training to inference
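To make the eliminated step concrete, the manual conversion described above can be sketched as simple symmetric linear quantization. This is an illustrative assumption, not Deep-AI's actual method or any specific vendor's pipeline:

```python
def quantize_int8(weights):
    # Symmetric linear quantization of 32-bit floats to int8 range.
    # Illustrative sketch only; real post-training quantization pipelines
    # also calibrate activations, fold batch-norm layers, etc.
    scale = max(abs(w) for w in weights) / 127.0  # largest weight maps to +/-127
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float values; the rounding error here is the
    # accuracy loss a manual 32-bit -> 8-bit conversion can introduce.
    return [v * scale for v in q]

weights = [0.42, -0.13, 0.07, -0.58, 0.99]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
assert max_err <= scale / 2 + 1e-9  # error bounded by half a quantization step
```

The rounding error bounded in the last line is exactly the per-weight accuracy cost that a training flow producing 8-bit output directly avoids.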
In addition, because we train in an 8-bit format with high sparsity, the resulting inference model is up to 95% smaller, delivering storage savings for memory-constrained systems.