Validating a Proposed Data Mining Approach (SLDM) for Motion Wearable Sensors to Detect the Early Signs of Lameness in Sheep

Student thesis: Doctoral Thesis


Lameness can be described as painful, erratic movement that relates to the locomotor system and results in the animal deviating from its normal gait or posture. It is considered one of the major health and welfare concerns for the sheep industry in the UK, causing substantial economic losses and reducing overall farm productivity. According to a 2013 report by ADAS entitled ‘Economic Impact of Health and Welfare Issues in Beef Cattle and Sheep in England’, each lame ewe costs £89.80 owing to declines in body condition, lambing percentage, and growth rate, and to reduced fertility. Early detection can therefore mitigate the negative impact of lameness and increase the chance of a favourable outcome from treatment. The development of wearable sensor technologies makes it possible to remotely monitor the changes in animal behaviour or movement that relate to lameness.

The aim of this thesis was to evaluate the feasibility and accessibility of a proposed data mining approach (SLDM) for detecting the early signs of lameness in sheep by analysing data retrieved from a wearable motion sensor mounted within a sheep’s neck collar. To that end, it investigates the most cost-effective factors contributing to lameness detection across the whole data mining process: sensor sampling rate, segmentation method, window size, extracted features, feature selection method, and classification algorithm. Three classes were recognised for sheep while they were walking during data collection (sound, mild lameness, and severe lameness). The sheep data were collected using three different sensor applications (Sheep Tracker, SensoDuino, SensorLog), which recorded sheep movement data at sampling rates of 10, 5, and 4 Hz. Various sensing data were retrieved in the X, Y, and Z dimensions; however, only accelerometer, gyroscope, and orientation readings are considered in the current study. Four sheep datasets were aggregated, comprising 31, 10, 18, and 7 sheep, respectively. The work evaluates the performance of ensemble classifiers (Bagging, Boosting, and RusBoosting) using three validation methods (5-fold cross-validation, 0.3 hold-out, and a proposed method, ‘Single Sheep Splitting’) across three sampling rates (10, 5, and 4 Hz), two segmentation approaches (FNSW and FOSW), three feature selection methods (ReliefF, GA, and RF), and three window sizes (10, 7, and 5 sec.).
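
The two segmentation approaches can be illustrated with a minimal sketch. It assumes that FNSW and FOSW denote fixed-size non-overlapping and fixed-size overlapping sliding windows; the function name `sliding_windows`, its parameters, and the 60-second test signal are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence


def sliding_windows(samples: Sequence, window_sec: float, rate_hz: float,
                    overlap: float = 0.0) -> List[list]:
    """Split a sample sequence into fixed-size windows.

    `overlap` is the fraction of each window shared with the next one;
    0.0 gives non-overlapping windows. (Hypothetical helper.)
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    size = int(round(window_sec * rate_hz))  # samples per window
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    step = max(1, int(round(size * (1.0 - overlap))))  # hop between window starts
    windows = []
    # Only complete windows are kept; a trailing partial window is dropped.
    for start in range(0, len(samples) - size + 1, step):
        windows.append(list(samples[start:start + size]))
    return windows


# Hypothetical recording: 60 s of a single sensor axis sampled at 10 Hz.
signal = list(range(600))

fnsw = sliding_windows(signal, window_sec=10, rate_hz=10)               # no overlap
fosw = sliding_windows(signal, window_sec=10, rate_hz=10, overlap=0.2)  # 20% shared

print(len(fnsw[0]), len(fnsw), len(fosw))
```

With a 10 sec. window at 10 Hz each window holds 100 data-points; the overlapping variant yields more windows from the same recording because each window starts before the previous one has ended.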

Promising lameness prediction accuracies are achieved over most of the combinations (3 sampling rates, 2 segmentation methods, 3 window sizes, 183 extracted features, 3 feature selection methods, 3 ensemble classification models, and 3 model validation methods). The highest accuracy, 88.92%, is obtained with the Bagging ensemble classifier, with F-scores of 87.7%, 91.1%, and 88.2% for the sound, mildly lame, and severely lame walking classes, respectively. These results are obtained using 5-fold cross-validation over a 10 sec. window for sheep data collected at a 10 Hz sampling rate, using only the accelerometer hardware sensor readings and the calculated orientation readings. The number of selected features is 46, optimised by GA using a CHAID tree as the fitness function. Conversely, the lowest prediction accuracy, 56.25%, with F-scores of 63.4%, 51.9%, and 48.8% for the sound, mildly lame, and severely lame walking classes, is recorded when the RusBoosting ensemble is applied using 5-fold cross-validation over a 10 sec. window for the dataset collected at the 4 Hz sampling rate.
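
As a reminder of how overall accuracy and per-class F-scores relate, the following sketch computes both from a three-class confusion matrix. It assumes that "F-score" means the F1 score (the harmonic mean of precision and recall); the function name and the matrix values are hypothetical illustrations and are not results from the thesis.

```python
def per_class_f_scores(cm):
    """Return the F1 score of each class from a square confusion matrix.

    cm[i][j] counts windows of true class i predicted as class j.
    (Hypothetical helper, assuming F-score = F1.)
    """
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, actually another class
        fn = sum(cm[k]) - tp                       # actually k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return scores


# Made-up counts for the classes (sound, mild, severe); not thesis data.
cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]

accuracy = sum(cm[k][k] for k in range(3)) / sum(sum(row) for row in cm)
f_scores = per_class_f_scores(cm)
print(accuracy, f_scores)
```

Reporting the F-score per class alongside accuracy shows whether a classifier performs evenly across the sound, mild, and severe classes or favours one of them.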

The major research findings therefore suggest that a 10 Hz sampling rate is adequate for collecting sheep movements, and that FOSW, in which 20% of data-points are shared between two successive windows, is the best segmentation method. The preferable number of data-points (sheep movements) to be pre-processed is around 100, which is obtained when a 10 sec. or 7 sec. window size is applied. Additionally, the 20 features selected by RF out of the 183 extracted features yield good accuracy compared with the whole feature set. Although the GA feature selection method has a slower execution time than RF, competitive prediction accuracy can be achieved when the features selected by GA are fed to the classifier. Finally, the acceleration sensor data alone are sufficient for deciding whether a sheep is lame, so no extra hardware sensor such as a gyroscope is required for decision making; moreover, the orientation sensor features, which contribute most to lameness detection, can be derived directly from the accelerometer (Acc) readings.
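
How orientation readings might be derived from acceleration can be sketched with the common gravity-based tilt formulas. This is an assumption for illustration: the abstract does not state which formulas or axis conventions the thesis uses, and the function name `tilt_from_acceleration` is hypothetical.

```python
import math


def tilt_from_acceleration(ax: float, ay: float, az: float):
    """Estimate (pitch, roll) in degrees from one accelerometer sample.

    Assumes the reading is dominated by gravity, i.e. the sensor is
    momentarily near rest. (Hypothetical helper; the axis convention
    is an assumption.)
    """
    roll = math.degrees(math.atan2(ay, az))
    pitch = math.degrees(math.atan2(-ax, math.hypot(ay, az)))
    return pitch, roll


# Sensor lying flat: gravity falls entirely on the Z axis.
flat_pitch, flat_roll = tilt_from_acceleration(0.0, 0.0, 9.81)

# Sensor rolled onto its side: gravity falls entirely on the Y axis.
side_pitch, side_roll = tilt_from_acceleration(0.0, 9.81, 0.0)

print(flat_pitch, flat_roll, side_pitch, side_roll)
```

The sketch covers pitch and roll only; heading about the gravity axis cannot be recovered from a single accelerometer sample, so any such component would need a separate treatment.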

Since this research identifies the most cost-effective factors, the approach could meanwhile be applied by farmers, stakeholders, and manufacturers, as no sensor for detecting lame sheep has yet been developed. The multidisciplinary nature of the research also opens diverse paths for further studies to develop data mining approaches and practical sensor kits that detect the early signs of lameness in sheep, for better farm productivity and sheep industry prosperity in the UK.
Date of Award: Jul 2020
Original language: English
Awarding Institution
  • University of Northampton
Supervisor: Scott Turner (Supervisor), Ali Al-Sherbaz (Supervisor) & Wanda McCormick (Supervisor)


  • motion wearable sensors
  • sensor data mining
  • supervised machine learning
  • CART ensemble classifier
  • sheep lameness detection
  • sheep behaviour classification

Cite this

Validating a Proposed Data Mining Approach (SLDM) for Motion Wearable Sensors to Detect the Early Signs of Lameness in Sheep
Al-Rubaye, Z. (Author). Jul 2020
