CNN Architecture for Surgical Image Segmentation with Recursive Structure and Flip-Based Upsampling

Taito Manabe, Koki Tomonaga, Koki Fujita, Yuichiro Shibata, Taiichiro Kosaka, Tomohiko Adachi


Laparoscopic surgery, a less invasive camera-aided form of surgery, is now commonly performed. However, it requires a camera assistant who holds and maneuvers the laparoscope. If a robot controls the laparoscope automatically, a surgeon can operate without a camera assistant, which would be beneficial in areas suffering from a shortage of surgeons. In this paper, we propose a prototype image segmentation architecture based on a convolutional neural network (CNN) to realize automated laparoscope control for cholecystectomy. Because the training dataset is annotated manually by a few surgeons, it is small compared with the datasets used by typical CNN-based systems. To mitigate overfitting, we therefore built a recursive network structure in which some sub-networks are applied multiple times. In addition, instead of the common transposed convolution, we introduce flip-based subpixel reconstruction in the upsampling layers. Furthermore, we apply stochastic depth regularization to the recursive structure for better accuracy. Evaluation results show that these changes improve classification accuracy without increasing the number of parameters. The system achieves a throughput sufficient for real-time laparoscope robot control on a single NVIDIA GeForce GTX 1080 GPU.


laparoscopic surgery; semantic segmentation; CNN; recursive structure
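
The abstract's idea of reusing a sub-network several times, with stochastic depth applied to those repeated passes, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy `shared_block`, the residual form of each pass, and the names `recursive_forward`, `repeats`, and `survival_prob` are all assumptions made for illustration, and the flip-based subpixel upsampling is not shown because the abstract does not specify it. The sketch only demonstrates that the parameter count stays fixed however many times the block is applied.

```python
import random

def shared_block(x, weights):
    """Toy stand-in for a sub-network (hypothetical): affine map followed by ReLU."""
    scale, bias = weights
    return [max(0.0, scale * v + bias) for v in x]

def recursive_forward(x, weights, repeats, survival_prob=1.0,
                      training=False, rng=None):
    """Apply the same block `repeats` times as residual passes.

    Assumed stochastic depth rule: during training each pass is skipped
    with probability 1 - survival_prob; at inference every pass is kept
    and its branch output is scaled by survival_prob.
    """
    rng = rng or random
    x = list(x)
    for _ in range(repeats):
        if training:
            if rng.random() >= survival_prob:
                continue  # pass dropped: identity shortcut only
            branch = shared_block(x, weights)
        else:
            branch = [survival_prob * b for b in shared_block(x, weights)]
        x = [xi + bi for xi, bi in zip(x, branch)]
    return x

# The same two parameters are reused by every pass, so increasing
# `repeats` adds depth without adding parameters.
weights = (0.5, 0.1)
features = [1.0, -2.0, 3.0]
one_pass = recursive_forward(features, weights, repeats=1)
three_passes = recursive_forward(features, weights, repeats=3)
```

In a real network the block would be a stack of convolutional layers and the survival probability would be a tuned hyperparameter; the values above are placeholders.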
