Segmentation
Routines for image segmentation.
SegmentationEM
Bases: BaseImageSegmentationAggregator
The Segmentation EM algorithm performs a categorical aggregation task for each pixel: whether it should be included in the resulting aggregate or not.
This task is solved with the single-coin Dawid-Skene algorithm.
Each worker has a latent skill parameter that represents the probability that the worker answers correctly.
Skills and true pixel labels are optimized with the Expectation-Maximization algorithm:

1. E-step. Estimates the posterior probabilities using the specified workers' segmentations, the prior probabilities for each pixel, and the workers' error probability vector.
2. M-step. Estimates the probability of a worker answering correctly using the specified workers' segmentations and the posterior probabilities for each pixel.

A sketch of one such iteration is shown after the reference below.
D. Jung-Lin Lee, A. Das Sarma and A. Parameswaran. Aggregating Crowdsourced Image Segmentations. CEUR Workshop Proceedings. Vol. 2173, (2018), 1-44.
https://ceur-ws.org/Vol-2173/paper10.pdf
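The following is a minimal NumPy sketch of one E-step/M-step pair for a single task under the single-coin model described above. The helper name `em_step_sketch` and the array layout (one stacked array of binary masks per task) are assumptions for illustration, not the library's implementation.

```python
import numpy as np

def em_step_sketch(masks, errors, priors):
    """One illustrative E/M update for a single task (not the library code).

    masks:  (n_workers, H, W) binary segmentations submitted by the workers
    errors: (n_workers,) current per-worker error probabilities
    priors: (H, W) prior probability that each pixel belongs to the aggregate
    """
    correct = 1 - errors
    # E-step: likelihood of the observed votes if the pixel is (or is not) included.
    p_if_in = np.where(masks == 1, correct[:, None, None], errors[:, None, None]).prod(axis=0)
    p_if_out = np.where(masks == 0, correct[:, None, None], errors[:, None, None]).prod(axis=0)
    posteriors = priors * p_if_in / (priors * p_if_in + (1 - priors) * p_if_out)
    # M-step: a worker's accuracy is the average agreement of their votes with the
    # current posteriors; the error probability is its complement.
    agreement = np.where(masks == 1, posteriors, 1 - posteriors)
    new_errors = 1 - agreement.mean(axis=(1, 2))
    return posteriors, new_errors
```

Iterating these two updates until the loss change falls below `tol` (or `n_iter` is reached) mirrors the procedure described above.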
Examples:
>>> import numpy as np
>>> import pandas as pd
>>> from crowdkit.aggregation import SegmentationEM
>>> df = pd.DataFrame(
>>> [
>>> ['t1', 'p1', np.array([[1, 0], [1, 1]])],
>>> ['t1', 'p2', np.array([[0, 1], [1, 1]])],
>>> ['t1', 'p3', np.array([[0, 1], [1, 1]])]
>>> ],
>>> columns=['task', 'worker', 'segmentation']
>>> )
>>> result = SegmentationEM().fit_predict(df)
Source code in crowdkit/aggregation/image_segmentation/segmentation_em.py
eps = 1e-15
class-attribute
instance-attribute
The convergence threshold.
errors_ = attr.ib(init=False)
class-attribute
instance-attribute
The workers' error probability vector.
loss_history_ = attr.ib(init=False)
class-attribute
instance-attribute
A list of loss values during training.
n_iter = attr.ib(default=10)
class-attribute
instance-attribute
The maximum number of EM iterations.
posteriors_ = attr.ib(init=False)
class-attribute
instance-attribute
The posterior probabilities for each pixel to be included in the resulting aggregate. Each probability is in the range from 0 to 1, and all probabilities must sum up to 1.
priors_ = attr.ib(init=False)
class-attribute
instance-attribute
The prior probabilities for each pixel to be included in the resulting aggregate. Each probability is in the range from 0 to 1, and all probabilities must sum up to 1.
segmentation_region_size_ = attr.ib(init=False)
class-attribute
instance-attribute
Segmentation region size.
segmentations_sizes_ = attr.ib(init=False)
class-attribute
instance-attribute
Sizes of image segmentations.
tol = attr.ib(default=1e-05)
class-attribute
instance-attribute
The tolerance stopping criterion for iterative methods with a variable number of steps.
The algorithm converges when the loss change is less than the tol
parameter.
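Both stopping criteria can be set at construction time; the values below are illustrative only.

```python
from crowdkit.aggregation import SegmentationEM

# Run at most 20 EM iterations, stopping earlier once the loss change drops below 1e-6.
agg = SegmentationEM(n_iter=20, tol=1e-6)
```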
fit(data)
Fits the model to the training data with the EM algorithm.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`data` | `DataFrame` | The training dataset of workers' segmentations, represented as a `pandas.DataFrame` containing the `task`, `worker`, and `segmentation` columns. | required |

Returns:

Name | Type | Description |
---|---|---|
SegmentationEM | `SegmentationEM` | self. |
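A brief usage sketch: after `fit`, the estimator returns itself, and the documented attributes (such as `errors_` and `loss_history_`) can be inspected. The data frame mirrors the class example above.

```python
import numpy as np
import pandas as pd
from crowdkit.aggregation import SegmentationEM

df = pd.DataFrame(
    [
        ['t1', 'p1', np.array([[1, 0], [1, 1]])],
        ['t1', 'p2', np.array([[0, 1], [1, 1]])],
        ['t1', 'p3', np.array([[0, 1], [1, 1]])],
    ],
    columns=['task', 'worker', 'segmentation'],
)

agg = SegmentationEM().fit(df)
print(agg.errors_)        # the workers' error probability vector (see the attribute above)
print(agg.loss_history_)  # loss values recorded during training
```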
Source code in crowdkit/aggregation/image_segmentation/segmentation_em.py
fit_predict(data)
Fits the model to the training data and returns the aggregated segmentations.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`data` | `DataFrame` | The training dataset of workers' segmentations, represented as a `pandas.DataFrame` containing the `task`, `worker`, and `segmentation` columns. | required |

Returns:

Name | Type | Description |
---|---|---|
Series | `Series[Any]` | Task segmentations. The `pandas.Series` data is indexed by `task` and contains the aggregated segmentation for each task. |
Source code in crowdkit/aggregation/image_segmentation/segmentation_em.py
SegmentationMajorityVote
Bases: BaseImageSegmentationAggregator
The Segmentation Majority Vote algorithm chooses a pixel if and only if the pixel has "yes" votes from at least half of all workers.
This method implements a straightforward approach to image segmentation aggregation:
if a pixel is not inside the worker's segmentation, the vote is counted as 0; otherwise, it is counted as 1.
These categorical values are then aggregated for each pixel with the Majority Vote algorithm (a sketch follows the reference below).
The method also supports weighted majority voting if the skills
parameter is provided to the fit
method.
D. Jung-Lin Lee, A. Das Sarma and A. Parameswaran. Aggregating Crowdsourced Image Segmentations. CEUR Workshop Proceedings. Vol. 2173, (2018), 1-44.
https://ceur-ws.org/Vol-2173/paper10.pdf
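A minimal NumPy sketch of the per-pixel vote described above, assuming binary masks stacked along the first axis and one non-negative weight per worker (uniform weights reduce to the plain majority vote). The helper name is hypothetical and this is not the library's implementation.

```python
import numpy as np

def weighted_pixel_majority_sketch(masks, weights):
    # masks:   (n_workers, H, W) binary segmentations for one task
    # weights: (n_workers,) per-worker weights (e.g. skills)
    weights = np.asarray(weights, dtype=float)
    # Weighted share of "yes" votes for every pixel.
    yes_share = (weights[:, None, None] * masks).sum(axis=0) / weights.sum()
    # Keep a pixel when at least half of the (weighted) votes are "yes".
    return yes_share >= 0.5
```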
Examples:
>>> import numpy as np
>>> import pandas as pd
>>> from crowdkit.aggregation import SegmentationMajorityVote
>>> df = pd.DataFrame(
>>> [
>>> ['t1', 'p1', np.array([[1, 0], [1, 1]])],
>>> ['t1', 'p2', np.array([[0, 1], [1, 1]])],
>>> ['t1', 'p3', np.array([[0, 1], [1, 1]])]
>>> ],
>>> columns=['task', 'worker', 'segmentation']
>>> )
>>> result = SegmentationMajorityVote().fit_predict(df)
Source code in crowdkit/aggregation/image_segmentation/segmentation_majority_vote.py
default_skill = attr.ib(default=None)
class-attribute
instance-attribute
Default worker weight value.
on_missing_skill = attr.ib(default='error')
class-attribute
instance-attribute
A value which specifies how to handle assignments performed by workers with an unknown skill.
Possible values (a usage sketch follows this list):

* `error`: raises an exception if there is at least one assignment performed by a worker with an unknown skill;
* `ignore`: drops assignments performed by workers with an unknown skill during prediction, and raises an exception if there are no assignments with a known skill for any task;
* `value`: the default value (`default_skill`) will be used if a skill is missing.
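A usage sketch of these options: skills are supplied for two of the three workers, and the third worker falls back to `default_skill` because `on_missing_skill='value'`. The specific skill values are illustrative only.

```python
import numpy as np
import pandas as pd
from crowdkit.aggregation import SegmentationMajorityVote

df = pd.DataFrame(
    [
        ['t1', 'p1', np.array([[1, 0], [1, 1]])],
        ['t1', 'p2', np.array([[0, 1], [1, 1]])],
        ['t1', 'p3', np.array([[0, 1], [1, 1]])],
    ],
    columns=['task', 'worker', 'segmentation'],
)

# Skills are known for p1 and p2 only; p3 gets default_skill.
skills = pd.Series([1.0, 0.5], index=pd.Index(['p1', 'p2'], name='worker'))
agg = SegmentationMajorityVote(on_missing_skill='value', default_skill=0.3)
result = agg.fit_predict(df, skills)
```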
skills_ = named_series_attrib(name='skill')
class-attribute
instance-attribute
The workers' skills. The pandas.Series
data is indexed by worker
and has the corresponding worker skill.
fit(data, skills=None)
Fits the model to the training data.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`data` | `DataFrame` | The training dataset of workers' segmentations, represented as a `pandas.DataFrame` containing the `task`, `worker`, and `segmentation` columns. | required |
`skills` | `Series` | The workers' skills. The `pandas.Series` data is indexed by `worker` and has the corresponding worker skill. | None |

Returns:

Name | Type | Description |
---|---|---|
SegmentationMajorityVote | `SegmentationMajorityVote` | self. |
Source code in crowdkit/aggregation/image_segmentation/segmentation_majority_vote.py
fit_predict(data, skills=None)
Fits the model to the training data and returns the aggregated segmentations.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`data` | `DataFrame` | The training dataset of workers' segmentations, represented as a `pandas.DataFrame` containing the `task`, `worker`, and `segmentation` columns. | required |
`skills` | `Series` | The workers' skills. The `pandas.Series` data is indexed by `worker` and has the corresponding worker skill. | None |

Returns:

Name | Type | Description |
---|---|---|
Series | `Series[Any]` | Task segmentations. The `pandas.Series` data is indexed by `task` and contains the aggregated segmentation for each task. |
Source code in crowdkit/aggregation/image_segmentation/segmentation_majority_vote.py
SegmentationRASA
Bases: BaseImageSegmentationAggregator
The Segmentation RASA (Reliability Aware Sequence Aggregation) algorithm chooses a pixel if the sum of the workers' weighted votes for that pixel exceeds 0.5.
The Segmentation RASA algorithm consists of three steps:

1. Performs the weighted Majority Vote algorithm.
2. Calculates weights for each worker from the current Majority Vote estimation.
3. Performs the Segmentation RASA algorithm for a single image.

The algorithm works iteratively. At each step, the workers are reweighted according to their distances
from the current estimate of the answer. The distance is calculated as \(1 - \mathrm{IoU}\), where IoU
(Intersection over Union) measures the extent of overlap between two segmentations (a sketch of this reweighting follows the reference below).
This algorithm is a modification of the RASA method for texts.
J. Li, F. Fukumoto. A Dataset of Crowdsourced Word Sequences: Collections and Answer Aggregation for Ground Truth Creation. Proceedings of the First Workshop on Aggregating and Analysing Crowdsourced Annotations for NLP, (2019), 24-28.
https://doi.org/10.18653/v1/D19-5904
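A minimal NumPy sketch of one reweighting step under the \(1 - \mathrm{IoU}\) distance described above, assuming binary masks and that workers closer to the current estimate receive larger weights. The helper name is hypothetical, and the exact weighting scheme used by the library may differ.

```python
import numpy as np

def rasa_reweight_sketch(masks, estimate, eps=1e-9):
    # masks:    (n_workers, H, W) binary segmentations for one task
    # estimate: (H, W) current aggregate estimate (e.g. a weighted majority vote)
    intersection = np.logical_and(masks, estimate).sum(axis=(1, 2))
    union = np.logical_or(masks, estimate).sum(axis=(1, 2))
    iou = intersection / np.maximum(union, 1)  # avoid division by zero for empty masks
    distance = 1 - iou                         # per-worker distance from the estimate
    weights = 1 / (distance + eps)             # closer workers get larger weights
    return weights / weights.sum()             # normalize the weights
```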
Examples:
>>> import numpy as np
>>> import pandas as pd
>>> from crowdkit.aggregation import SegmentationRASA
>>> df = pd.DataFrame(
>>> [
>>> ['t1', 'p1', np.array([[1, 0], [1, 1]])],
>>> ['t1', 'p2', np.array([[0, 1], [1, 1]])],
>>> ['t1', 'p3', np.array([[0, 1], [1, 1]])]
>>> ],
>>> columns=['task', 'worker', 'segmentation']
>>> )
>>> result = SegmentationRASA().fit_predict(df)
Source code in crowdkit/aggregation/image_segmentation/segmentation_rasa.py
loss_history_ = attr.ib(init=False)
class-attribute
instance-attribute
A list of loss values during training.
mv_ = attr.ib(init=False)
class-attribute
instance-attribute
The weighted task segmentations calculated with the Majority Vote algorithm.
n_iter = attr.ib(default=10)
class-attribute
instance-attribute
The maximum number of iterations.
tol = attr.ib(default=1e-05)
class-attribute
instance-attribute
The tolerance stopping criterion for iterative methods with a variable number of steps.
The algorithm converges when the loss change is less than the tol
parameter.
weights_ = attr.ib(init=False)
class-attribute
instance-attribute
A list of workers' weights.
fit(data)
Fits the model to the training data.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`data` | `DataFrame` | The training dataset of workers' segmentations, represented as a `pandas.DataFrame` containing the `task`, `worker`, and `segmentation` columns. | required |

Returns:

Name | Type | Description |
---|---|---|
SegmentationRASA | `SegmentationRASA` | self. |
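A brief usage sketch: after `fit`, the documented attributes such as `weights_` and `mv_` can be inspected. The data frame mirrors the class example above, and the `n_iter` value is illustrative.

```python
import numpy as np
import pandas as pd
from crowdkit.aggregation import SegmentationRASA

df = pd.DataFrame(
    [
        ['t1', 'p1', np.array([[1, 0], [1, 1]])],
        ['t1', 'p2', np.array([[0, 1], [1, 1]])],
        ['t1', 'p3', np.array([[0, 1], [1, 1]])],
    ],
    columns=['task', 'worker', 'segmentation'],
)

agg = SegmentationRASA(n_iter=25).fit(df)
print(agg.weights_)  # workers' weights (see the attribute above)
print(agg.mv_)       # weighted Majority Vote segmentations (see the attribute above)
```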
Source code in crowdkit/aggregation/image_segmentation/segmentation_rasa.py
fit_predict(data)
Fits the model to the training data and returns the aggregated segmentations.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`data` | `DataFrame` | The training dataset of workers' segmentations, represented as a `pandas.DataFrame` containing the `task`, `worker`, and `segmentation` columns. | required |

Returns:

Name | Type | Description |
---|---|---|
Series | `Series[Any]` | Task segmentations. The `pandas.Series` data is indexed by `task` and contains the aggregated segmentation for each task. |
Source code in crowdkit/aggregation/image_segmentation/segmentation_rasa.py