AI and machine learning have grown rapidly over the last couple of years, and many problems can now be solved or improved using this technology. For instance, neural networks have proven highly effective at classifying objects in images and video sequences. Most often, these applications use 2D data. In recent years, major advances have been made in equipment that can generate 3D data, adding a dimension that could potentially improve tasks such as object classification.

This thesis evaluates 3D data in the form of point clouds generated by stereo cameras. We present two vehicle classification neural networks, both based on point clouds. The first network, referred to as the 3D model, uses raw point cloud data as input, fully utilizing the information point clouds provide. The second network, referred to as the 2D model, bases its input on projections of point clouds. The 3D model follows the architecture of PointNet, a network developed by the pioneers of deep learning on raw point cloud data. We adopt their approach of applying deep learning directly to irregular point clouds without any conversion: a multi-layer perceptron is applied to each point, followed by a symmetric aggregation function. The 2D model follows the architecture of another well-known network, VGG16, and uses 2D images as input. These images are generated by converting point clouds into voxels and computing a density value for each voxel.

We evaluate each model separately to identify its strengths and weaknesses, and we assess whether raw point clouds can achieve performance on par with or better than projected point clouds. Empirically, both proposed models perform strongly on the task of classifying vehicles, exceeding 98% accuracy. Both models are also lightweight in terms of network parameters and fast in terms of inference time. We show that raw 3D point cloud data is as effective an input as 2D image data while requiring less pre-processing, and that relatively few points are required as input to ensure reliable classifications. We conclude that neither model is superior to the other, as the evaluation shows that both models are roughly equal in performance.
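To make the 3D model's core idea concrete, the sketch below shows a minimal PointNet-style classifier in PyTorch: a shared multi-layer perceptron applied independently to every point, followed by max pooling as the symmetric function that makes the output invariant to point order. The layer sizes, class count, and framework choice are illustrative assumptions, not the thesis's actual configuration.

```python
import torch
import torch.nn as nn

class PointNetStyleClassifier(nn.Module):
    """Minimal sketch of a PointNet-style classifier (sizes are assumptions)."""

    def __init__(self, num_classes: int = 2):
        super().__init__()
        # Shared MLP applied independently to each point: a Conv1d with
        # kernel size 1 acts as a per-point fully connected layer.
        self.mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.ReLU(),
        )
        self.classifier = nn.Sequential(
            nn.Linear(1024, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (batch, 3, num_points) -- raw, unordered xyz coordinates.
        features = self.mlp(points)                  # (batch, 1024, num_points)
        # Symmetric function: max pooling over the point dimension, so the
        # result does not depend on the order of the input points.
        global_feature = features.max(dim=2).values  # (batch, 1024)
        return self.classifier(global_feature)

# Usage: classify a batch of 4 point clouds with 1024 points each.
logits = PointNetStyleClassifier()(torch.randn(4, 3, 1024))
```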
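The 2D model's input pipeline voxelizes a point cloud and computes a per-voxel density. The sketch below shows one plausible variant of such a conversion, assuming a top-down projection onto a 2D grid; the grid resolution, projection axis, and normalization are assumptions for illustration rather than the thesis's actual parameters.

```python
import numpy as np

def point_cloud_to_density_image(points: np.ndarray, grid: int = 64) -> np.ndarray:
    """Convert an (N, 3) point cloud into a (grid, grid) density image by
    counting points per cell and normalizing counts to [0, 1]."""
    # Scale x/y coordinates into [0, 1) relative to the cloud's bounding box.
    mins, maxs = points.min(axis=0), points.max(axis=0)
    scaled = (points[:, :2] - mins[:2]) / np.maximum(maxs[:2] - mins[:2], 1e-9)
    idx = np.minimum((scaled * grid).astype(int), grid - 1)
    # Count how many points fall into each cell -- the "density" value.
    image = np.zeros((grid, grid), dtype=np.float32)
    np.add.at(image, (idx[:, 1], idx[:, 0]), 1.0)
    return image / image.max() if image.max() > 0 else image

# Usage: a random cloud of 2048 points becomes a 64x64 density image,
# which a VGG16-style 2D CNN could then take as input.
density = point_cloud_to_density_image(np.random.rand(2048, 3))
```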