Chest X-ray imaging is a radiological assessment tool used to detect many types of lung disease, including lung tumors. We train our model with SDFN (Segmentation-based Deep Fusion Networks) and SE (Squeeze-and-Excitation) blocks on a combination of whole and cropped lung X-ray images. Using both views improves the model's attention and avoids problems introduced by image misalignment and unwanted objects, as well as the loss of small targets when images are resized. Two CNNs extract features, and the extracted features are concatenated to form the final output used to determine whether lung tumors are present. Unlike previous methods of identifying lesion hotspots in X-ray images, we use SEG-GRAD-CAM to generate heatmaps that localize the lung tumors. In our experiments, the method achieved 98.51% accuracy and 99.01% sensitivity in classifying chest X-ray images with and without tumors. The method can reduce errors caused by differing judgments among radiologists and assist them in making medical decisions.
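
To make the two building blocks named above concrete, the following is a minimal, dependency-free sketch of a squeeze-and-excitation step and of feature concatenation. It is an illustration under stated assumptions, not our implementation: the function names (`se_block`, `fuse_features`, `matvec`), the weight shapes, and the assignment of one feature vector to the whole image and one to the cropped lung region are hypothetical, and a real model would use a deep learning framework with learned weights.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(weights: Matrix, x: Vector) -> Vector:
    """Multiply a weight matrix (one row per output unit) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def se_block(feature_maps: List[Matrix], w1: Matrix, w2: Matrix) -> List[Matrix]:
    """Reweight each channel of `feature_maps` (channels x height x width).

    w1 has shape (reduced, channels) and w2 has shape (channels, reduced);
    both are hypothetical stand-ins for learned parameters (biases omitted).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in fm) / (len(fm) * len(fm[0]))
        for fm in feature_maps
    ]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, h) for h in matvec(w1, squeezed)]
    gates = [1.0 / (1.0 + math.exp(-g)) for g in matvec(w2, hidden)]
    # Scale: multiply every value in a channel by that channel's gate.
    return [
        [[value * gate for value in row] for row in fm]
        for fm, gate in zip(feature_maps, gates)
    ]


def fuse_features(whole_image: Vector, cropped_lung: Vector) -> Vector:
    """Concatenate the feature vectors produced by the two CNN branches."""
    return whole_image + cropped_lung
```

With all-zero weights every gate equals sigmoid(0) = 0.5, so each channel is simply halved; learned weights would instead emphasize informative channels and suppress the rest. The fused vector would then be passed to a classifier that decides whether a tumor is present.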