Tag: TG_UNDOCUMENTED
(1829 ranking factors)
Factors |
---|
TR
web_production: 1
Text relevance (MaxFreq, the frequency of the most frequent word, serves as a measure of document length).
|
LR
web_production: 2
Weight: 0.049061648412321 Link relevance. The factor is remapped.
|
TRp1
web_production: 4
Strict priority for TR (a text priority): all the query words occur somewhere in the document and satisfy the query's contextual restrictions (for example, both words must be in one sentence).
|
TRp2
web_production: 5
Weight: -0.109820338929289 Phrase priority for TR (a text priority): all the query words occur consecutively in the document.
|
LRp1
web_production: 6
(Strict) All the query words occur in one link.
|
LRp2
web_production: 7
Weight: 0.019119257307239 (Phrase) All the query words occur consecutively in one link.
|
TRtitle
web_production: 8
The exact phrase (the query text) is present in the title (more precisely, in the first sentence of the document). Contextual restrictions and stop words are handled exactly as in TRp2, i.e. factor [8] never exceeds factor [5].
|
TRhr
web_production: 9
There is a fragment that passed the quorum in which all word positions are marked as having Best_relev relevance (title or meta keywords).
|
News
web_production: 11
This is a news page (determined by the characteristic ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaformula/tekushichiekomponenty/klassificacionnye?v=tkd#h45859-3 patterns in the URL))).
|
Cat
web_production: 13
This is a catalog (determined by the characteristic ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayafformula/tekushhiekomponenty/klassificacionnye? .
|
Long
web_production: 15
Weight: -0.084798680877042 Long document (the longer the document, the greater the value of the factor).
|
TRhitw
web_production: 16
HitWeight is a variant of text relevance in which all hits are weighted equally (i.e. the bonuses for the title and for word proximity are ignored). The hits must still satisfy the restrictions of the syntactic wizard, so the TRhitw factor can be assumed to be 0 if and only if SoftAndOk is 0.
|
LongQuery
web_production: 17
Weight: 0.030334786608805 The sum of the IDFs of the query words. The name is misleading: for example, the query 'Gadyach' gets a larger value than the query 'Moscow Peter Yekaterinburg Samara'.
|
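A minimal sketch of why a single rare word can outscore four frequent ones in LongQuery. The textbook IDF formula, the collection size, and the document frequencies below are all assumptions for illustration; the source does not give the production IDF definition or the remap.

```python
import math

def idf(doc_freq: int, total_docs: int) -> float:
    """Textbook inverse document frequency (assumed, not from the source)."""
    return math.log(total_docs / doc_freq)

def long_query(query_words, doc_freqs, total_docs):
    """Sum of the IDFs of the query words, before any remap."""
    return sum(idf(doc_freqs[w], total_docs) for w in query_words)

# Hypothetical document frequencies in a collection of 1,000,000 documents.
TOTAL_DOCS = 1_000_000
DOC_FREQS = {
    "gadyach": 2,
    "moscow": 600_000,
    "peter": 500_000,
    "yekaterinburg": 400_000,
    "samara": 450_000,
}

rare = long_query(["gadyach"], DOC_FREQS, TOTAL_DOCS)
common = long_query(["moscow", "peter", "yekaterinburg", "samara"], DOC_FREQS, TOTAL_DOCS)
print(rare > common)
```

With these made-up frequencies the one-word query has the larger IDF sum, which is the behaviour the description warns about.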
Geo
web_production: 22
Indicates that the user's region and the site's region match at the country level. Binary factor: 1 - match, 0 - no match. Based on the ((http://wiki.yandex-team.ru/ Yandexposisk/ Classification of Sytraitniki/ Geographic/Sospolzanievpoysk geoclassification of sites)).
|
SubqueryThMatch
web_production: 23
Match between the thematic spectra of the query and the document. Query topics are produced by the ((http://wiki.yandex-team.ru/evgenijjkroxalev/subquery rules of the SubquerySearch wizard)). The document topic is taken from Yandex.Catalog.
|
SR
web_production: 24
Weight: 0.049845924868959 Composite Static Rank, assembled from static components by a separate formula ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/#oftnd1 *)).
|
TRref
web_production: 25
A factor about the number of refines. The query language has a user-refine feature (a word prefixed with a percent sign). The intent is roughly 'it would be good if this word were in the document'. The only valuable use of this feature known to ((http://staff.yandex-team.ru/gulin Andrey Gulin)) is a query of the form [%official %site name of the film]. Users do not know about the feature because it is not described in any documentation. It is planned to remove it from the query language, but words with User_refine priority will remain in the wizard. The factor shows the maximum number of user_refine words found simultaneously within a single hit in the quorum. The count ranges from 0 to 3 (anything above 3 is treated as 3). This number is mapped into the half-open interval [0, 1).
|
TRboost
web_production: 26
The multiplier applied to some link factors (namely factors 6, 7, 47, 66) when text relevance is 0 and there are few links.
|
TRLRlemma
web_production: 27
In textual relevance, Lemma coincides.
|
LRHitNum100
web_production: 31
Weight: 0.033485833700259 The remapped number of query words across all links to the URL.
|
LRHitNumGt16
web_production: 32
The document has LR > 20. The number of query words in links is > 16. A factor about LR.
|
PctLinks
web_production: 33
Weight: -0.141668202468497 For documents with a high LR, normalized link relevance excluding proximity; for documents with a low LR, 0.
|
HasLR
web_production: 34
The URL has a high LR.
|
TRUnmapped
web_production: 39
TR divided by the cube of the number of words in the query and transformed by the standard REMAPTR.
|
RusLang
web_production: 40
The language of the document is Russian.
|
AddTime
web_production: 41
Weight: 0.006691168756865 The time the page was added; a larger value means an older document. The root of the time is mapped to the interval [0, 1] so that 3+ years gives 1.
|
IsMainPage
web_production: 42
If this is the main page of the owner (most often a second-level domain, for example xxxx.ru), the factor is 1. For free hosting, personal blogs, etc. (for example LiveJournal, Narod.ru), third-level domains (such as xxxxx.narod.ru) also get a factor of 1.
|
AddTimeMP
web_production: 43
The time the main page of the owner (host?) was added. Remapped in the same way as AddTime.
|
TextBM25
web_production: 46
Simple BM25 in text.
|
LinkBM25
web_production: 47
Simple BM25 over links; link weights are not taken into account.
|
TLBM25
web_production: 48
Weight: 0.031399776481102 Simple BM25 in text and links at the same time.
|
TLp1
web_production: 49
All the words of the request are in the text + links.
|
TxtPair
web_production: 53
Weight: -0.020921642736537 Simple BM25 over word pairs: take all pairs of query words and count their occurrences in the document text. The weight of a pair is the sum of the weights of its words. Does not work if the query contains a stop word.
|
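The pair-based scoring in TxtPair can be sketched roughly as follows. This is an illustration only: it counts adjacent occurrences of each query-word pair and applies a BM25-style term-frequency saturation `tf / (tf + k)` without length normalization. The function name, the constant `k`, the adjacency requirement, and the final normalization by total pair weight are assumptions, not the production formula.

```python
from itertools import combinations

def txt_pair(query_words, word_weights, doc_words, k=1.0):
    """BM25-style saturation over query-word pairs (illustrative sketch)."""
    # Adjacent word pairs of the document, in order of appearance.
    adjacent = list(zip(doc_words, doc_words[1:]))
    score = 0.0
    total_weight = 0.0
    for a, b in combinations(query_words, 2):
        tf = sum(1 for pair in adjacent if pair == (a, b))
        # Per the description: pair weight is the sum of the word weights.
        weight = word_weights[a] + word_weights[b]
        score += weight * tf / (tf + k)
        total_weight += weight
    # Normalize so the result stays in [0, 1); a one-word query has no pairs.
    return score / total_weight if total_weight else 0.0

doc = "buy red shoes and red shoes".split()
print(txt_pair(["red", "shoes"], {"red": 1.0, "shoes": 2.0}, doc))
```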
LnkPair
web_production: 54
The same as TxtPair, but for links; link weights are not taken into account.
|
TxtBreak
web_production: 55
BM25 computed from the number of sentences in the document in which the word occurs.
|
TxtHead
web_production: 56
Weight: -0.037878046829073 BM25 computed over headings only.
|
TxtHiRel
web_production: 57
BM25 computed over high-relevance hits only ('significant' text, with emphasis such as <b>, etc.).
|
WordCount
web_production: 59
min(number of query words / 10, 1.f)
|
InvWordCount
web_production: 60
1 / number_of_words_in_query
|
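The two query-length factors above translate directly into code. A minimal sketch; the function names are illustrative.

```python
def word_count(num_query_words: int) -> float:
    """WordCount: min(number of query words / 10, 1)."""
    return min(num_query_words / 10, 1.0)

def inv_word_count(num_query_words: int) -> float:
    """InvWordCount: 1 / number of query words (query must be non-empty)."""
    return 1 / num_query_words

print(word_count(3), word_count(25), inv_word_count(4))
```

WordCount saturates at 1 for queries of ten words or more, while InvWordCount keeps shrinking as the query grows.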
HasNoTR
web_production: 61
The document has no TR.
|
HasNoLR
web_production: 62
The document has no LR.
|
Hops
web_production: 65
The number of hops from the main page to the URL (the closer to the main page, the lower the value: 0 - the main page itself, 1 - unreachable from the main page, between 0 and 1 - reachable from the main page). The typical value for the host root is 0.0039.
|
LogLR
web_production: 66
Weight: 0.026926509552263 Logarithm of LR, linearly mapped to [0, 1].
|
TxtPairEx
web_production: 67
Weight: -0.00667940021707 the presence of pairs of words in the exact form
|
TxtBreakEx
web_production: 68
Weight: 0.024006117828321 the number of sentences in which there are many words in the exact form
|
TxtHeadEx
web_production: 69
Weight: -0.03957553241619 the presence of words in the header in the exact form
|
TxtHiRelEx
web_production: 70
BM25 in the exact form
|
TxtBm25Ex
web_production: 71
Simple BM25 in the exact form.
|
TxtPairSy
web_production: 72
Weight: -0.022152880819573 The presence of word pairs, taking synonyms into account (>= TxtPair)
|
TxtBreakSy
web_production: 73
Weight: -0.116819481337211 the number of sentences in which there are many words taking into account synonyms
|
TxtHeadSy
web_production: 74
Weight: -0.012919083353605 the presence of words in the header, taking into account synonyms
|
TxtHiRelSy
web_production: 75
Weight: -0.039215257302626 BM25 taking into account synonyms
|
TxtBm25Sy
web_production: 76
Simple BM25 taking into account synonyms.
|
XLRp0
web_production: 81
There are all the words of the request in the links
|
XLRp1
web_production: 82
There are all the words of the request in one link
|
XLRp2
web_production: 83
Weight: 0.0051601584234 There is a link that has passed quorum
|
XLRgood
web_production: 84
Weight: -0.00083343707893 The share of “good” links
|
XLRmanyBad
web_production: 85
The number of “bad” links (bad means DPR = 0)
|
XLRmaxDpr
web_production: 86
Weight: -0.065082391728977 Maximum DPR links
|
XLRtfidf
web_production: 87
Ordinary TF*IDF over links. The frequency of each word in the links is multiplied by its inverse document frequency and summed over all words; the sum is then normalized by document length.
|
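A minimal sketch of the TF*IDF computation described for XLRtfidf. The textbook IDF `log(N / df)`, the whitespace tokenization, and the meaning of `doc_length` (passed in by the caller) are assumptions; the source does not say how document length is measured here.

```python
import math
from collections import Counter

def link_tfidf(query_words, link_texts, doc_freqs, total_docs, doc_length):
    """Sum of tf * idf over query words in link texts, divided by document length."""
    # Term frequencies over all incoming link texts.
    counts = Counter(w for text in link_texts for w in text.lower().split())
    score = sum(
        counts[w] * math.log(total_docs / doc_freqs[w])
        for w in query_words
    )
    return score / doc_length

# Hypothetical data for illustration.
links = ["buy red shoes", "red shoes shop"]
value = link_tfidf(["red", "shoes"], links, {"red": 10, "shoes": 100}, 1000, 4)
print(value)
```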
XLRrelev
web_production: 88
Link relevance according to Gulin
|
XLRrelev200
web_production: 89
Link relevance according to Gulin
|
XLRlogRelev
web_production: 90
Link relevance according to Gulin
|
BFexact
web_production: 91
The exact form of every query word is present in the text/links
|
BFlemma
web_production: 92
The lemma of every query word is present in the text/links
|
SoftAndOk
web_production: 93
The document passed SoftAnd under the restrictions of the syntactic wizard. Only for documents with text relevance. For single-word queries, always 1.
|
Ukrainian
web_production: 95
Equals 1 if the site has the Ukrainian geo attribute (i.e. 1 means a Ukrainian site)
|
IsBlog
web_production: 96
Page from a blog hosting service
|
IsLivejournal
web_production: 97
Page from Livejournal.com
|
TextFeatures
web_production: 100
Weight: -0.016033504310566 Text quality. Computed by a rather complex formula
|
TextLike
web_production: 101
Weight: -0.094096848692163 Text quality (Alekseev's classifier)
|
MusicQ
web_production: 108
How music-related the query is. Output of Anton Konygin's wizard.
|
XLExactMatches
web_production: 109
The number of links that exactly match the query
|
DocLen
web_production: 110
Weight: -0.065128132003719 Document length in sentences
|
UrlLen
web_production: 111
Weight: -0.001158034315755 The length of the URL, divided by 5
|
HostSize
web_production: 113
Weight: -0.032004809610482 Host size (Raskovalov's measure) in documents, without accounting for duplicates (each duplicate counts as an independent document in the factor)
|
IsHTML
web_production: 114
Document type - HTML
|
LinkSpeed
web_production: 115
Weight: 0.009455905387837 The inverse of the variance of the appearance times of links containing query words
|
XThLRrelev
web_production: 116
Link relevance, taking into account thematicity
|
XThLRrelev200
web_production: 117
Link relevance, taking into account thematicity
|
XThLRlogRelev
web_production: 118
Link relevance, taking into account thematicity
|
XLerfLRrelev
web_production: 119
Link relevance, taking into account the quality of each link
|
XLerfLRrelev200
web_production: 120
Link relevance, taking into account the quality of each link
|
XLerfLRlogRelev
web_production: 121
Weight: 0.060594485044371 Link relevance, taking into account the quality of each link
|
XLerfThLRlogRelev
web_production: 122
Link relevance, taking into account the quality of each link and thematicity of each link
|
XNonCommLRlogRelev
web_production: 123
Link relevance, taking into account how non-commercial each link is
|
XNonCommThLRlogRelev
web_production: 124
Link relevance, taking into account how non-commercial each link is and its thematicity
|
XNonCommLerfLRlogRelev
web_production: 125
Link relevance, taking into account how non-commercial each link is and the quality of each link
|
XNonCommLerfThLRlogRelev
web_production: 126
Link relevance, taking into account how non-commercial each link is, the quality of each link and its thematicity
|
GeoCityProxim
web_production: 127
Weight: 0.051465613603836 Indicates that the region mentioned in the query and the region of the found sites match at the region level. Binary factor: 1 - match, 0 - no match. Based on the ((http://wiki.yandex-team.ru/ Yandexposisk/ Classification of Sytraitniki/ Geographic/Sospolzanievpoysk geoclassification of sites)).
|
LinksWithWordsPercent
web_production: 128
Weight: -0.060922780495065 The percentage of incoming links with the words of the request
|
LinksWithAllWordsPercent
web_production: 129
Weight: -0.08383112850758 The percentage of incoming links with all the words of the request
|
PornoQuery
web_production: 130
Whether the query contains any words from Yweb/Pornofilter/Porno.query.
|
IsPorno
web_production: 131
Document from porn kitski
|
IsFake
web_production: 133
Fast document
|
IsWiki
web_production: 135
page from ru.wikipedia.org
|
IsEShop
web_production: 136
Commercial page (Classifier Savina)
|
GeoRegionProxim
web_production: 137
Weight: 0.082967074248567
|
HasNoAllWordsTRSy
web_production: 138
The document does not contain all the query words (up to synonyms)
|
NumWordsTRSy
web_production: 139
The percentage of query words present in the document (up to synonyms)
|
HasAllWordsTRSy
web_production: 140
The document contains all the query words (up to synonyms)
|
NumWordsLR
web_production: 141
The percentage of query words present in the links (up to synonyms)
|
HasAllWordsLR
web_production: 142
All the query words are present in the links (up to synonyms)
|
TxtInvPair
web_production: 144
TR over pairs of query words in reverse order
|
LnkInvPair
web_production: 145
LR over pairs of query words in reverse order
|
TxtSkipPair
web_production: 146
Weight: -0.077504878926916 TR over pairs of query words separated by one word, in texts
|
LnkSkipPair
web_production: 147
LR over pairs of query words separated by one word, in texts
|
NumWordsTRFm
web_production: 148
The percentage of query words present in the text (up to word form)
|
HasAllWordsTRFm
web_production: 149
The document contains all the query words (up to word form)
|
QBlog
web_production: 151
Whether the query contains blog vocabulary
|
XGeoLRlogRelev
web_production: 152
Weight: 0.009314594460961 log(LR, restricted to the user's country)
|
XLerfGeoLRlogRelev
web_production: 153
Weight: 0.044511155721215 log(LerfLR, restricted to the user's country)
|
XLExactMatchesMap
web_production: 155
The number of links that match the query text (a different remap)
|
XLerfNormLRlogRelev
web_production: 156
XLerfLRlogRelev (normalized by the sum of the Lerf weights of all links rather than by the sum of their original weights)
|
XNonCommNormLRlogRelev
web_production: 157
Weight: 0.062474190501436 XNonCommLRlogRelev (normalized by the sum of the NonComm weights of all links rather than by the sum of their original weights)
|
XNonCommThNormLRlogRelev
web_production: 158
Link relevance, taking into account how non-commercial each link is and its thematicity
|
XNonCommLerfNormLRlogRelev
web_production: 159
XNonCommLerfLRlogRelev (normalized by the sum of the NonCommLerf weights of all links rather than by the sum of their original weights)
|
XNonCommLerfThNormLRlogRelev
web_production: 160
Link relevance, taking into account how non-commercial each link is, the quality of each link and its thematicity
|
LinkAge
web_production: 163
Weight: 0.000426528744914 The average age of the links that contributed to LR: LinkAge = min(log(average age of links) / 7, 1); 3 years maps to 1
|
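The LinkAge remap can be checked numerically. The sketch below assumes the age is measured in days and the logarithm is natural; neither is stated in the source, but the combination is consistent with "3 years maps to 1", since ln(1095) is about 7. The guard for ages of one day or less is also an assumption.

```python
import math

def link_age(avg_age_days: float) -> float:
    """LinkAge = min(log(average link age) / 7, 1), age in days (assumed)."""
    if avg_age_days <= 1:
        # Assumption: clamp at 0 so the factor stays inside [0, 1].
        return 0.0
    return min(math.log(avg_age_days) / 7, 1.0)

print(link_age(30), link_age(3 * 365), link_age(10 * 365))
```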
TLen
web_production: 164
The length of the page text in words: TLen = map(number of words, 1/400), where map(x, y) = x*y / (1 + x*y)
|
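The `map(x, y)` remap used by TLen is a simple saturating function. A minimal sketch follows; the Python function is named `remap` only to avoid shadowing the built-in `map`.

```python
def remap(x: float, y: float) -> float:
    """map(x, y) = x*y / (1 + x*y): maps x >= 0 into [0, 1)."""
    p = x * y
    return p / (1 + p)

def tlen(num_words: int) -> float:
    """TLen = map(number of words, 1/400)."""
    return remap(num_words, 1 / 400)

print(tlen(0), tlen(400), tlen(4000))
```

With the scale 1/400, a 400-word page lands at the midpoint of the range, and longer pages approach 1 without reaching it.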
IsUnreachable
web_production: 165
The page is unreachable via links from the main page.
|
XLangLRlogRelev
web_production: 166
LR, taking into account whether the language of the link matches the language of the query
|
XLerfLangLRlogRelev
web_production: 167
Weight: 0.000094696411924 LR, taking into account whether the language of the link matches the language of the query, and link quality
|
XLRCatalogRelev
web_production: 178
Weight: 0.0199886635755 LR over catalog descriptions
|
XLRYaCatalogRelev
web_production: 179
LR over descriptions in Yandex.Catalog
|
ExactWordOrderLen
web_production: 180
The length of the longest match between the text and the query, by exact word form
|
ExactWordOrderWeight
web_production: 181
The weight of the longest match between the text and the query, by exact word form
|
WordOrderLen
web_production: 182
The length of the longest match between the text and the query, by lemma
|
WordOrderWeight
web_production: 183
The weight of the longest match between the text and the query, by lemma
|
LinkMaxAge
web_production: 184
The maximum age of a significant cluster of links that contributed to LR
|
TRp1All
web_production: 185
Variants of the corresponding factors that take stop words into account
|
LRp1All
web_production: 186
Variants of the corresponding factors that take stop words into account
|
TLp1All
web_production: 187
Weight: 0.055767877134775 Variants of the corresponding factors that take stop words into account
|
BFexactAll
web_production: 188
Variants of the corresponding factors that take stop words into account
|
BFlemmaAll
web_production: 189
Weight: 0.059222635368125 Variants of the corresponding factors that take stop words into account
|
PassageLegacyTR
web_production: 190
Weight: 0.038806477920761 TR of the best passage, i.e. how good the snippet is
|
TxtBM25AttenSyn
web_production: 191
Weight: 0.075434934641649 TR with attenuation over sentences
|
IsForum
web_production: 196
The URL matches the forum_detector regular expression
|
IsObsolete
web_production: 198
The URL contains an old date. Used to recognize old news. The factor is 1 if the URL contains a year <= 2007.
|
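One way the IsObsolete check could look. This is a hypothetical heuristic, not the production detector: the regular expression, the accepted year range (1900 to 2099), and the rule that a year must not be embedded in a longer digit run are all assumptions.

```python
import re

# Four-digit years 1900-2099 that are not part of a longer run of digits.
YEAR_RE = re.compile(r"(?<!\d)(19\d{2}|20\d{2})(?!\d)")

def is_obsolete(url: str, threshold: int = 2007) -> int:
    """Return 1 if the URL contains a year <= threshold, else 0."""
    years = [int(y) for y in YEAR_RE.findall(url)]
    return 1 if any(y <= threshold for y in years) else 0

print(is_obsolete("http://example.com/news/2006/05/item.html"))
print(is_obsolete("http://example.com/news/2012/item.html"))
```

The lookarounds keep numeric identifiers such as `120061` from being read as the year 2006.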
TRWithStops
web_production: 199
The weight of the longest match between the text and the query, by exact word form
|
LRWithStops
web_production: 200
The weight of the longest match between the text and the query, by exact word form
|
HasPayments
web_production: 201
The page mentions 'paid SMS'.
|
EshopValue
web_production: 203
Weight: -0.123814718900663 Stage of the page
|
PornoValue
web_production: 204
How pornographic the page is
|
GeoRelevRegionCity
web_production: 215
|
GeoRelevRegionRegion
web_production: 216
|
GeoRelevRegionCountry
web_production: 217
Weight: 0.084012276385059 Three levels of match between the geography of the user and the page
|
XLRGeoRelevRegionCity
web_production: 218
|
XLRGeoRelevRegionRegion
web_production: 219
|
XLRGeoRelevRegionCountry
web_production: 220
Weight: 0.042452794899003 Three levels of match between the region of the links and the query
|
GeoCountryProxim
web_production: 221
Weight: 0.01317157982937 Geographical proximity
|
IsForeignQuery
web_production: 241
The query is not in Russian
|
PageRegionSizeIn
web_production: 243
Weight: 0.056552232052119 The size of the page's region
|
PageRegionInvSizeIn
web_production: 244
Weight: -0.006950709230428 The factor is inversely proportional to the size of the page's region
|
QueryRegionSize
web_production: 245
The size of the query's region
|
QueryRegionInvSize
web_production: 246
The factor is inversely proportional to the size of the query's region
|
GeoGeometryProxim
web_production: 247
Weight: -0.000843495929565 The geographical proximity of the user and the site
|
XLRVideoRelev
web_production: 267
Link factor about the presence of a video on the page.
|
AuxTextBM25
web_production: 268
BM25 over the user's region for localized queries; for non-localized queries in KUB, the country is used. The query texts sent for each region can be viewed in Relev_regions.txt in the wizard
|
AuxLinkBM25
web_production: 269
The same for link relevance
|
LCor
web_production: 281
Weight: 0.038372460585705 Characterizes the frequency of words in links. The factor is large if the word that contributed to link relevance is rare in links.
|
SubqueryThMatchA
web_production: 282
Weight: 0.178646516342524 Match between the thematic spectra of the query and the document. Query topics are produced by the ((http://wiki.yandex-team.ru/evgenijjkroxalev/subquery rules of the SubquerySearch wizard)). The document topic is determined by an automatic classifier
|
TRDocQuorum
web_production: 283
The weight of the query words found in the text
|
LRDocQuorum
web_production: 284
The weight of the query words found in the links
|
TRLRDocQuorum
web_production: 285
The weight of the query words found in the text and links
|
XPornoLRlogRelev
web_production: 289
How pornographic the document is, judged by the text of its links
|
XPornoNormLRlogRelev
web_production: 290
How pornographic the document is, judged by the text of its links; a different normalization
|
XPornoQuery
web_production: 291
Classifier of porn queries; uses a different dictionary than PornoQuery
|
GeoCountryCountryProxim
web_production: 293
The geographical proximity of the country of the site and the country of request
|
SpecificalQuery
web_production: 296
The query is locale-specific: it is often reformulated with an explicit mention of the region. ((https://ml.yandex-team.ru/archive/thread1433892/#Message1433892 more))
|
LnkBreak
web_production: 302
Weight: 0.078872214489662 Analogues of the corresponding text factors for links. BM25 computed from the number of links in which a match occurred.
|
LnkBm25Ex
web_production: 303
Simple BM25 in the exact form in link texts
|
LnkPairSy
web_production: 304
Weight: 0.046891090311905 The presence of word pairs in links, taking synonyms into account
|
LnkBrkSy
web_production: 305
Weight: 0.035447186193336 The number of links passed the threshold
|
LnkBm25Sy
web_production: 306
Simple BM25 by links taking into account synonyms
|
VideoQuery
web_production: 307
The query is about video
|
XLRMarketRelev
web_production: 318
LR by links from Yandex.Market
|
Poetry
web_production: 319
How poetic the document is
|
PoetryQuad
web_production: 320
The maximum poeticness of a quatrain
|
EngLang
web_production: 321
Document language - English
|
Has2ExactQueryParts
web_production: 322
The query is fully covered by two exact groups, each consisting of exact matches of consecutive query words ((http://wiki.yandex-team.ru/poiskovajaplatform/tr/coveragebygroups about coverage by groups))
|
HasLevensht1QueryFragment
web_production: 323
There is a group of exact matches of query words that covers the query (possibly with one word skipped, added or replaced)
|
LargestSyInexactGroup
web_production: 324
Weight: -0.067337343351376 The share of the query covered by the longest group consisting of any hits (including word forms and synonyms), possibly with one word skipped, added or replaced
|
UrlHasNoDigits
web_production: 331
There are no digits in the URL
|
SynS1
web_production: 334
Shows how unnatural the text is from the point of view of the Russian language: an estimate of how likely the document text was produced by a synonymizer or another automatic generator. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
SynFLremap1
web_production: 335
Weight: 0.002431406823392 Shows how unnatural the text is from the point of view of the Russian language: an estimate of how likely the document text was produced by a synonymizer or another automatic generator. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
SynFLremap2
web_production: 336
Weight: 0.08033186404617 Shows how unnatural the text is from the point of view of the Russian language: an estimate of how likely the document text was produced by a synonymizer or another automatic generator. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
SyntQuality
web_production: 344
Weight: 0.010872234578071 Whether the query has a complete syntactic parse
|
PageDate
web_production: 345
Weight: -0.034716206980983 The document date as written on the page, remapped
|
QSegmentsBM25
web_production: 351
Weight: -0.059299975637935 BM25 in which the extracted segments of the query act as 'words'
|
QSegmentsWeight
web_production: 352
Weight: -0.057628362537565 'Weight' of the query segments in the text
|
SynPercentBadWordPairs
web_production: 353
An indicator of how unnatural the text is from the point of view of the Russian language. The number of bad word pairs in the text, mapped to the interval [0, 1] by the formula Z/(Z+10)
|
SynNumBadWordPairs
web_production: 354
The share of bad pairs among all pairs found in the table: Z/(X+1), where Z is the number of bad pairs in the text and X is the number of ((http://wiki.yandex-team.ru/evgenijgrechnikov/testsynonimizers of 2000-navigable)) pairs
|
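The two bad-pair formulas above, written out as a small sketch. The function names are illustrative; Z and X are taken as already-computed counts, since the source does not describe how pairs are detected.

```python
def bad_pairs_remap(z: int) -> float:
    """Number of bad word pairs Z mapped to [0, 1): Z / (Z + 10)."""
    return z / (z + 10)

def bad_pairs_share(z: int, x: int) -> float:
    """Share of bad pairs among pairs found in the table: Z / (X + 1)."""
    return z / (x + 1)

print(bad_pairs_remap(10), bad_pairs_share(3, 5))
```

The `+ 1` in the second denominator keeps the ratio defined when no table pairs are found.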
NumLatinLetters
web_production: 355
Weight: -0.086731079136512 The number of Latin letters in the text (not counting markup), mapped to [0, 1] by the formula n/(n+100)
|
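A minimal sketch of NumLatinLetters, assuming the markup has already been stripped and that "Latin letters" means the ASCII letters a-z and A-Z (accented Latin characters are not counted here; that choice is an assumption).

```python
import string

def num_latin_letters(text: str) -> float:
    """Count ASCII Latin letters n and remap with n / (n + 100)."""
    n = sum(1 for ch in text if ch in string.ascii_letters)
    return n / (n + 100)

print(num_latin_letters("Привет, world"))
```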
DocIdfSumFixed
web_production: 357
Previous factors - fixed
|
TitleIdfSumFixed
web_production: 358
Weight: 0.047164043400143 Previous factors - fixed
|
HeadingIdfSumFixed
web_production: 359
Weight: -0.068235863277027 Previous factors - fixed
|
NormalTextIdfSumFixed
web_production: 360
Previous factors - fixed
|
LRAmortizedByAge
web_production: 363
Weight: 0.003128580544172 Link relevance with a penalty for old links
|
RusWordsInText
web_production: 364
The number of words in the text (a word is whatever the lemmatizer extracted), mapped to [0, 1] by the formula x/(x+a)
|
RusWordsInTitle
web_production: 365
Weight: 0.03118624384934 The number of words of the Russian language in the title
|
MeanWordLength
web_production: 366
Weight: 0.019580616053835 The average length of the word
|
PercentWordsInLinks
web_production: 367
Weight: 0.057053549836014 The percentage of words inside <a> .. </a> tags out of all words
|
PercentVisibleContent
web_production: 368
Weight: -0.032828345615772 The percentage of words outside tags (outside the angle brackets <>) out of all words
|
PercentFreqWords
web_production: 369
Weight: -0.020210221137273 The percentage of words in the text that are among the 200 most frequent words of the language
|
PercentUsedFreqWords
web_production: 370
Weight: -0.063976585802142 The number of the 500 most frequent words of the language that are used in the text, divided by 500
|
TrigramsProb
web_production: 371
Weight: -0.002170850269151 Logarithm of the geometric mean of the trigram probabilities in the text (the probability of a trigram is the number of its occurrences in the text divided by the number of all trigrams). Mapped to [0, 1] according to the formula -x (x+a)
|
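The logarithm of a geometric mean equals the arithmetic mean of the logarithms, which gives a direct way to compute the raw TrigramsProb value. The sketch below stops before the final remap to [0, 1]. It assumes word-level trigrams and averages over trigram occurrences rather than distinct trigrams; the source does not specify either point.

```python
import math
from collections import Counter

def trigram_log_geomean(words):
    """Mean log-probability of the word trigrams occurring in the text."""
    trigrams = list(zip(words, words[1:], words[2:]))
    if not trigrams:
        return 0.0
    counts = Counter(trigrams)
    total = len(trigrams)
    # P(trigram) = occurrences in the text / number of all trigrams.
    return sum(math.log(counts[t] / total) for t in trigrams) / total

print(trigram_log_geomean("a b c a b c".split()))
```

Highly repetitive text yields values closer to 0, while text in which every trigram is unique yields -log(number of trigrams).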
TrigramsCondProb
web_production: 372
Weight: 0.026650508120317 Logarithm of the geometric mean of the conditional trigram probabilities. The conditional probability of a trigram is its probability divided by the probability of the bigram formed by its first two words
|
UrlBM25
web_production: 377
Weight: 0.066890922161289 BM25 over the URL
|
HasBigPicture
web_production: 378
The page has a big picture
|
DaterAge
web_production: 380
Weight: -0.207437366708906 The difference between the current date and the document date determined by the dater: 1 means the document date equals the current date, 0 means the document is 10 years old or more; if the date is not defined, the factor is 0. Note: ((1 - DaterAge)*60)^2 = age of the page in days.
|
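The stated relation ((1 - DaterAge)*60)^2 = age in days can be inverted to get the factor from an age. A sketch under that relation; the clamping at 0 for ages beyond 3600 days (about 10 years) follows from the description, and using `None` for an undefined date is an assumption of this example.

```python
import math

def dater_age(age_days):
    """DaterAge from page age in days; 0.0 if the date is undefined."""
    if age_days is None:
        return 0.0
    return max(0.0, 1.0 - math.sqrt(age_days) / 60)

def age_days_from_factor(factor: float) -> float:
    """Inverse relation from the description: ((1 - DaterAge) * 60) ** 2."""
    return ((1 - factor) * 60) ** 2

print(dater_age(0), dater_age(900), dater_age(3600))
```

Because of the square root, the factor drops quickly for fresh pages (a 900-day-old page is already at 0.5) and flattens out toward the 10-year mark.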
CInDegree1
web_production: 382
Host factors that detect sites with artificially inflated links: the second and third in-degrees ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAformula/tekushhiekomponenty/antispam?v=181rh58958953
|
CInDegree2
web_production: 383
Weight: 0.000692523218694 Host factors that detect sites with artificially inflated links: the second and third in-degrees ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAformula/tekushhiekomponenty/antispam?v=181rh58958953
|
NumNonRussianLinks
web_production: 384
The number of incoming links without Russian letters. Remapped.
|
TextMaxForms
web_production: 385
Weight: -0.015212586791057 The maximum number of forms over all query words: the max over all query words of number_of_forms_for_word/64
|
TextWeightedForms
web_production: 386
Weight: 0.022803839020796 The sum of the numbers of forms, weighted by word weights: the sum over all query words of number_of_forms_for_word/64*word_weight; remap of the form x/(1 + x).
|
TextForms
web_production: 387
Weight: -0.008656938143421 The unweighted sum of the numbers of forms: the sum over all query words of number_of_forms_for_word/64/number_of_words
|
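The three form-count factors above can be sketched as follows. The inputs (a list of form counts per query word and a matching list of word weights) and the function names are illustrative. The x/(1 + x) remap is applied only to the weighted variant, because the source mentions it only there.

```python
def text_max_forms(form_counts):
    """Max over query words of number_of_forms / 64."""
    return max(n / 64 for n in form_counts)

def text_weighted_forms(form_counts, word_weights):
    """Sum of number_of_forms / 64 * word_weight, remapped with x / (1 + x)."""
    x = sum(n / 64 * w for n, w in zip(form_counts, word_weights))
    return x / (1 + x)

def text_forms(form_counts):
    """Unweighted sum of number_of_forms / 64, divided by the number of words."""
    return sum(n / 64 for n in form_counts) / len(form_counts)

forms = [32, 64]       # hypothetical form counts for a two-word query
weights = [0.5, 0.5]   # hypothetical word weights
print(text_max_forms(forms), text_weighted_forms(forms, weights), text_forms(forms))
```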
LinkMaxForms
web_production: 388
The maximum number of forms over all query words
|
LinkWeightedForms
web_production: 389
Weight: 0.096811143316269 The sum of the numbers of forms, weighted by word weights
|
LinkForms
web_production: 390
The unweighted sum of the numbers of forms
|
TR_W1
web_production: 391
Analogues of the factors of the same name, the weight of the word = 1
|
XLR_W1
web_production: 392
Analogues of the factors of the same name, the weight of the word = 1
|
TextBM25_Fm_W1
web_production: 393
Analogues of the factors of the same name, the weight of the word = 1
|
TextBM25_Sy_W1
web_production: 394
Analogues of the factors of the same name, the weight of the word = 1
|
LinkBM25_W1
web_production: 395
Analogues of the factors of the same name, the weight of the word = 1
|
TLBM25_W1
web_production: 396
Analogues of the factors of the same name, the weight of the word = 1
|
NumeralsPortion
web_production: 399
The share of different parts of speech in the text. The share of numerals (among all words whose part of speech could be recognized)
|
ParticlesPortion
web_production: 400
Weight: -0.012429221647235 The share of particles
|
AdjPronounsPortion
web_production: 401
Weight: -0.005976754416269 The share of pronominal adjectives
|
AdvPronounsPortion
web_production: 402
Weight: -0.001250755074786 The share of pronominal adverbs
|
VerbsPortion
web_production: 403
The share of verbs
|
FemAndMasNounsPortion
web_production: 404
Weight: 0.011650367441796 The share of words that can be both masculine and feminine nouns, but not neuter, among all nouns (examples: 'hummingbird' has an indeterminate gender that can be resolved either way; 'Alexander' is a case of homonymy).
|
LinkQualityFixed
web_production: 405
Weight: 0.013112575551553 Quality of incoming links (hauser classifier), fixed version
|
HasLinkQualityFixed
web_production: 406
Whether LinkQuality was computed for this page (it is not computed if there are few links), fixed version
|
NewLinkQualityFixed
web_production: 407
Weight: 0.021178675054476 Quality classifier of incoming links 2, fixed version
|
IsOrg
web_production: 408
Weight: -0.018278527670779 The query is the name of an organization (example: Gazprom) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/warees Description))
|
SmartUkrainian
web_production: 411
|
SmartBelorussian
web_production: 412
|
LRWithoutRare
web_production: 413
Weight: -0.011221458184058 Link relevance without taking into account rare words
|
DifferentInternalLinks
web_production: 414
Weight: 0.096447224363928 The number of different internal links to the page
|
HasDeterminedCities
web_production: 415
Weight: 0.165031403865939 The city is defined for the site
|
GeoRegionalityUNew
web_production: 416
Query factors: the output of the ((http://wiki.yandex-team.ru/poiskovajaplatforma/lingvistika/zaprosnyjefefactory/localizovannyjezaprosya query geolocalization classifier)), a new version of factors [328]-[330]. U: taking the region of sites into account is meaningless for the query.
|
GeoRegionalityRNew
web_production: 417
Request factors - the result of the ((http://wiki.yandex-team.ru/PoiskovajaPlatforma/Lingvistika/ZaprosnyjeFactory/LocalizovannyjeZaprosy request geolocalization classifier)) - a new version of factors [328]-[330]: R - geo-relevant - regional results in the output could be useful, but nothing more;
|
GeoRegionalityVNew
web_production: 418
Request factors - the result of the ((http://wiki.yandex-team.ru/poiskovajaplatforma/lingvistika/zaprosnyjefefactory/localizovannyjezaprosya request geolocalization classifier)) - a new version of factors [328]-[330]: V - geo-vital - regional results are of fundamental importance.
|
UkrainPageRank
web_production: 420
Weight: 0.087122791007993 Ukrainian Page Rank
|
QClassDownload
web_production: 421
= 1 - v. Download formula. Class requests: download/watch online/play/photo/listen
|
QClassBrandnames
web_production: 422
The result of the request classifier: the request contains words from the corresponding dictionary. Brands
|
QClassDisease
web_production: 423
Medication Dictionary
|
QClassKak
web_production: 424
question
|
QClassMoscow
web_production: 425
Specific request for Moscow
|
QClassOAO
web_production: 426
Weight: -0.005085205304656 organization
|
QClassPorno
web_production: 427
porn
|
QClassTravel
web_production: 428
trips
|
PeriodicLinkDatesPercent
web_production: 430
Weight: 0.013900531929943 The frequency of links to the site
|
LinkAlmostPeriod
web_production: 431
The number of almost-periodic links
|
HasLiRuCounter
web_production: 434
The presence of a LiveInternet counter
|
DssmYaMusicEarlyBindingCe
web_production: 438
DSSM model with early binding, trained on reformulations and further trained on music requests for Alice
|
SecondIndegDistrXi
web_production: 439
Weight: -0.01085051113308 Eleven factors based on the statistical properties of the distribution of in-degrees of the vertices that link to a fixed vertex of the host graph. ((Http://wiki.yandex-team.ru/jandekpoisk/kachestvopoiska/obshayaformula/tekushhiekmponenty/HostdDEGRE))
|
RcSpylogUrlRationalSigmoidD1T240
web_production: 446
URL feature computed from rapid clicks spy_log counters with decay of 1 day
|
RcSpylogUrlRationalSigmoidD1T240Frozen
web_production: 447
URL feature computed from rapid clicks spy_log counters with decay of 1 day
|
RcSpylogUrlRationalSigmoidD0_5T30
web_production: 448
URL feature computed from rapid clicks spy_log counters with decay of 0.5 days
|
RcSpylogUrlRationalSigmoidD0_5T30Frozen
web_production: 449
URL feature computed from rapid clicks spy_log counters with decay of 0.5 days
|
TxtPair_W1
web_production: 454
Weight: -0.016932610010322 Simple BM25 over word pairs: we take all pairs of request words and count their occurrences in the text of the document. Weight = 1. It does not work if the request contains a stop word
|
AuraDocLogShared
web_production: 455
Weight: -0.097686304848915 Logarithm of the number of shingles on which this document is not unique
|
AuraDocLogAuthor
web_production: 456
Weight: -0.097277529611975 Logarithm of the number of shingles on which this owner of the document is recognized as the author
|
AuraDocMeanSharedWeight
web_production: 457
Weight: -0.110593487056685 The average weight of the non-unique shingles of this document
|
RegHostRank
web_production: 467
Weight: 0.156712439907419 Computed in the same way as the Hostrank factor, not over the whole Owner graph but over its subgraph consisting of the Owners in this region. Belonging to the region is determined by the TLD, or by the presence in the index of pages of this Owner that the GEO or Geoa classifier assigns to this region. Mapped in the same way as the Hostrank factor, from 0 to 1 with 256 gradations
|
RegIsWiki
web_production: 468
A document from the language section of Wikipedia corresponding to the user region
|
LanguageCompliance
web_production: 469
Weight: 0.054576897612176 The language of the document corresponds to the language language
|
NationalDomain
web_production: 476
The country of the document (domain) and the country of the user coincide ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaFormula/tekushhiekomponijafaktorov#national
|
RcSpylogUrlRationalSigmoidD3T120
web_production: 478
URL feature computed from rapid clicks spy_log counters with decay of 3 days
|
CountryQueryRegionality
web_production: 479
Weight: 0.012081787040108 Country classifier of localization - how much the request implies the context of the country
|
NumSlashes
web_production: 480
Weight: 0.050576094170344 The number of slashes in the URL
|
WatchVideo
web_production: 482
The presence of a built-in video player on the page
|
DownloadVideo
web_production: 483
Video for downloading
|
RcSpylogUrlRationalSigmoidD3T120Frozen
web_production: 484
URL feature computed from rapid clicks spy_log counters with decay of 3 days
|
RcSpylogUrlRationalSigmoidD14T300
web_production: 485
URL feature computed from rapid clicks spy_log counters with decay of 14 days
|
GskUrlModel
web_production: 487
Weight: 0.013412340418363 The factor is calculated from the text of the URL using the sequence classifier Quality/Seq/GSK
|
RcSpylogUrlRationalSigmoidD14T300Frozen
web_production: 489
URL feature computed from rapid clicks spy_log counters with decay of 14 days
|
RcSpylogAge
web_production: 490
Age of rapid clicks spy_log update, in seconds
|
RcSpylogFreshness
web_production: 491
Freshness of rapid clicks spy_log update
|
Bclm
web_production: 493
Weight: 0.030786458206337 Buettcher, Clarke and Lushman factor (modified) ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAformula/tekushichiekomponenty/bclm more))
|
FieldLM
web_production: 495
Weight: 1.36522746e-7 Unigram language model. The language model is built from the document and smoothed by the general language model. When building the document model, information is used about which field of the document the request word occurred in (Title, Head or Plain Text)
|
GeoCityUrlRegionCity
web_production: 496
Match between the geography determined from the URL of the document and the city of the request (IP or LR)
|
GeoCityUrlRegionRegion
web_production: 497
Match between the geography determined from the URL of the document and the region of the request (IP or LR)
|
GeoCityUrlRegionCountry
web_production: 498
Weight: -0.168645758020604 Match between the geography determined from the URL of the document and the country of the request (IP or LR). Relevant for Russia and Ukraine.
|
GeoCityUrlGeoCityCity
web_production: 499
Match between the geography determined from the URL of the document and the city in the request (GEOCITY rule)
|
TitleTrigramsQuery
web_production: 501
Weight: 0.112928770384249 Computes the coverage of the request by letter trigrams of the document title
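
A minimal sketch of what a trigram coverage of this kind could look like, assuming coverage means the fraction of distinct letter trigrams of one string that also occur in the other. The function names, the lowercasing and the whitespace normalization are illustrative assumptions, not the production definition.

```python
def letter_trigrams(text: str) -> set[str]:
    """Return the set of distinct letter trigrams of a string.

    Assumption: text is lowercased and runs of whitespace are collapsed.
    """
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + 3] for i in range(len(normalized) - 2)}


def trigram_coverage(covered: str, covering: str) -> float:
    """Fraction of trigrams of `covered` that also occur in `covering`."""
    target = letter_trigrams(covered)
    if not target:
        return 0.0  # strings shorter than three characters have no trigrams
    return len(target & letter_trigrams(covering)) / len(target)
```

Under these assumptions, `trigram_coverage(request, title)` would correspond to covering the request with title trigrams, and swapping the arguments would give the reverse coverage.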
|
TitleTrigramsTitle
web_production: 502
Computes the coverage of the document title by letter trigrams
|
InlinksModel
web_production: 503
Probabilistic model built on the texts of incoming links
|
QueryWordSequencesTR
web_production: 504
Weight: -0.11860635115951 Computes a sum of the following kind: sequences of more than two request words found in one sentence; it is normalized by the length of the document.
|
QueryWordSequencesLR
web_production: 505
Computes a sum of the following kind: sequences of more than two request words found in one link; it is normalized by the number of links.
|
GeoRelevAlienCity
web_production: 507
Weight: 0.084699401575226 The result has a city-level geography that is not the user's ([415] == 1 && [215] == 0)
|
GeoVQueryInUserCity
web_production: 508
Request geovitality for results from the user region
|
GeoVQueryInAlienCity
web_production: 509
Request geovitality for results not from the user region
|
HostReliability
web_production: 510
Weight: -0.045942748393758 The share of URLs that respond without errors
|
DmozThemeMatchAll
web_production: 511
Match between the thematic spectrum (according to DMOZ) of the request and of the document. The theme of the request is determined by ((http://wiki.yandex-team.ru/jandekspoisk/zarubezhnyjjinternet/dmozqueryClassifier1 the DmozTheme sorcerer rule))
|
DmozThemeMatchBest
web_production: 512
Match between the thematic spectrum (according to DMOZ) of the request and of the document. The theme of the request is determined by the best result of ((http://wiki.yandex-team.ru/jandekspoisk/zarubezhnyjjinternet/dmozqueryClassifier1 the DmozTheme sorcerer rule)). The subject of the document is determined by the automatic classifier
|
Mpsa
web_production: 513
Weight: 0.093045433292429 Evaluates the minimum distance between pairs of request words, taking into account how far the pair is from the beginning of the document (Minimal Pair Size with Attenuation). Pairs are understood to be all consecutive bigrams of the request words. Thus, the number of pairs equals the number of words in the request minus 1. Accordingly, the factor makes sense only for requests consisting of more than one word. ((Http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaformula/Tekushhiekomponenty/MPSA MPSA))
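
The pair construction described above can be sketched as follows. Only the pairing rule (consecutive bigrams of the request, so one pair fewer than words) comes from the description. The scoring formula, the `decay` parameter, the averaging and the function name are hypothetical choices made purely for illustration.

```python
def mpsa_sketch(query_words: list[str], doc_words: list[str], decay: float = 0.01) -> float:
    """Illustrative minimal-pair-distance score with positional attenuation.

    Hypothetical scoring: each pair contributes attenuation / distance for its
    best occurrence, where attenuation shrinks with distance from the start.
    """
    if len(query_words) < 2:
        return 0.0  # the factor only makes sense for multi-word requests

    # Index the positions of every word in the document.
    positions: dict[str, list[int]] = {}
    for index, word in enumerate(doc_words):
        positions.setdefault(word, []).append(index)

    pair_scores = []
    # Consecutive bigrams of the request: len(query_words) - 1 pairs.
    for first, second in zip(query_words, query_words[1:]):
        best = 0.0
        for p in positions.get(first, []):
            for q in positions.get(second, []):
                if p == q:
                    continue
                distance = abs(p - q)
                attenuation = 1.0 / (1.0 + decay * min(p, q))
                best = max(best, attenuation / distance)
        pair_scores.append(best)
    return sum(pair_scores) / len(pair_scores)
```

With this toy formula, adjacent request words near the top of the document score highest, and the same pair found further down scores lower.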
|
Bclm2
web_production: 514
It differs from BCLM in that the weights of all words are considered the same. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAformula/tekushhiekomponenty/bclm2 BCLM2))
|
AbsolutePLM
web_production: 515
Text relevance based on a language model, taking absolute position into account. We slide a window of 20 words over the text, build a language model on each window (that is, a probability distribution over the words of the Russian language) and calculate the probability of generating the request. The model is penalized for distance from the beginning of the document.
|
PageRegionCoverage
web_production: 516
Weight: -0.063761467432684
|
PageRegionSize
web_production: 517
Weight: -0.030877746812643 The size of the page region
|
PageRegionRelCoverage
web_production: 518
Weight: -0.000832706989751
|
RcSpylogFreshnessAtReq
web_production: 519
Freshness of rapid clicks spy_log update, calculated at the request time
|
IsGeo
web_production: 520
Weight: -0.027287688639737 Passes to the basic search, under the name ISGEO, the maximum weight of the geo-objects found in the request. A geo-object is understood as an object of the category GEO, Geo1, Geoaddr, Geoaddr1, Landmark or Landmark1 (see ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects kaovsky allocation))). ((http://wiki.yandex-team.ru/arsengadzhikurbanov/wares Read more))
|
IsMusic
web_production: 521
Passes to the basic search, under the name ISMUSIC, the maximum weight of the objects of category Music or Music1 found in the request. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV))). ((http://wiki.yandex-team.ru/arsengadzhikurbanov/warees more))
|
BclmLite
web_production: 522
Modification of the BCLM2 factor, lightweight for use in tulle. The main difference is that BCLMLite does not use absolute offsets of words relative to the beginning of the document. Instead, the factor works with ordinary positions of the form <sentence_number, position_in_sentence>. At the same time, the proximity between words is taken into account only inside a sentence. ((Http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaFormula/tekushichiekomponenty/bclmlite bclmlite))
|
NearbyQuery
web_production: 523
When answering a request, results in the immediate vicinity are important ([pharmacies], [children's clinic])
|
CityQuery
web_production: 524
Weight: -0.091993052812036 When answering a request, the results within the city are important (the bulk of localized queries)
|
AdmQuery
web_production: 525
When responding to a request, the results from the region of the user ([airport], [dairy]) are important
|
NumLinksFromMP
web_production: 526
The number of incoming links from main pages
|
YmwFull2
web_production: 527
Weight: -0.044940112806396 Fixed YMWFull. It differs from the previous version only in its behavior on 2-word queries. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAformula/tekushhiekomponenty/ymw Read more))
|
FullQuorum
web_production: 528
Binary factor, every word of the request is in the text or in the links
|
Soft404
web_production: 531
The page is a '404' (share of '404' tokens relative to the total number of tokens on the page)
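
As a rough illustration of the share described above, assuming a token is any maximal run of word characters (the real tokenizer is not described here) and using a hypothetical function name:

```python
import re


def soft404_share(page_text: str) -> float:
    """Share of tokens equal to '404' among all tokens on the page.

    Assumption: tokens are maximal runs of word characters.
    """
    tokens = re.findall(r"\w+", page_text)
    if not tokens:
        return 0.0  # an empty page has no tokens to count
    return sum(1 for token in tokens if token == "404") / len(tokens)
```

A short error page such as "Error 404 page not found" gets a high share under this definition, while a long article that mentions 404 once gets a share close to zero.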
|
RcSpylogUrlRationalSigmoidD1T240AtReq
web_production: 532
URL feature computed at the request time from rapid clicks spy_log counters with decay of 1 day
|
DBM25
web_production: 533
BM25 in which the weight of a word is machine-learned
|
QueryWordCohesionTR
web_production: 534
Weight: -0.053739168786067 The factor evaluates how closely the words of the request are grouped together in the text of the document, regardless of their order. ((http://wiki.yandex-team.ru/sergejjkrylov/queryWordCohesionTR Description))
|
RcSpylogUrlRationalSigmoidD0_5T30AtReq
web_production: 536
URL feature computed at the request time from rapid clicks spy_log counters with decay of 0.5 days
|
QueryDOwnerSessNormDuration_Reg
web_production: 537
CONTRY / K
|
QueryDOwnerWeightClick_Reg
web_production: 538
Weight: 0.115262514353577 w/k
|
QueryDOwnerOnlyClickRate_Reg
web_production: 539
Weight: 0.179216994410993 o/i
|
SegmentAuxAlphasInText
web_production: 542
Weight: 0.010581678208134 Number of letters in the AUX segment
|
SegmentAuxSpacesInText
web_production: 543
Weight: -0.011681967583253 The number of spaces in the AUX segment
|
SegmentContentCommasInText
web_production: 544
The number of commas in the Content segment
|
XLRGeoRelevRegionNatDomain
web_production: 546
Weight: 0.013370500669584
|
QueryRefTrigramQuery
web_production: 549
Weight: 0.054926147793071 ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAformula/tekushhiekomponenty/opisanijafaktorov#queryreftrigrams Description))
|
QueryRefTrigramReferences
web_production: 550
Weight: -0.096496414873675 ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAformula/tekushhiekomponenty/opisanijafaktorov#queryreftrigrams Description))
|
IdfVariance
web_production: 551
Weight: 0.025691573951246 Variance of the IDF of words,
|
UrlNGramsModel
web_production: 552
Weight: 0.055185094441888 Urlngramsmodel ranking factor in ERF
|
NationalLanguage
web_production: 553
The language of the document corresponds to the country of the request
|
GeoCountryUrlRegionCountry
web_production: 555
|
GeoCountryUrlGeoCountry
web_production: 556
|
NumLinksFromSegmentContent
web_production: 557
Weight: 0.094045741102708
|
Locm
web_production: 558
Weight: -0.070483297609751 The order of words in links.
|
FiltrationSegments
web_production: 561
The share of the segments of the request present in the text
|
LanguageGoodForTurkey
web_production: 562
The language of the document is one of those permissible for Turkey (Turkish, English, German, French, Arabic, Azerbaijani) or the document has zero length. At the search stage it is calculated only for Isrealgeolocal requests.
|
DBM25_2
web_production: 563
Variation on the topic ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaformula/tekushhiekomponenty/DBM25 dBM25)), see.
|
GeoDispersion
web_production: 564
Document links dispersion
|
BM25FdPRFixed
web_production: 566
Weight: 0.058870258158539 BM25FDPR normalized by the average document length, which depends on the language of the document. ((http://wiki.yandex-team.ru/bm25frework test results.))
|
LanguagePopularity
web_production: 567
The popularity of the language of the document. Number from 0 to 1. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaformula/tekushhiekomponenty/languaguaguagepopalarity))
|
RcSpylogUrlRationalSigmoidD3T120AtReq
web_production: 570
URL feature computed at the request time from rapid clicks spy_log counters with decay of 3 days
|
RcSpylogUrlRationalSigmoidD14T300AtReq
web_production: 571
URL feature computed at the request time from rapid clicks spy_log counters with decay of 14 days
|
Tocm
web_production: 572
Weight: -0.005028751679547 The factor evaluates how the positions of words in the title differ from their order in the request
|
RelevGeoLinksPercent
web_production: 573
Weight: -0.069803680024687
|
LangDispersion
web_production: 574
Dispersion of languages in XMAP
|
HasMisspell
web_production: 575
There is a typo in the request
|
DBM30Smerch
web_production: 576
Variation on the topic ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaformula/tekushhiekomponenty/DBM25 dBM25)), see.
|
UrlLinkPercent
web_production: 578
Weight: 0.089404211238337 The ratio of the number of incoming links whose text is a URL to the total number of incoming links
|
NumNonLettersInUrl
web_production: 580
Weight: -0.011207582653854 The number of non-letter characters in the URL
|
IsHub
web_production: 582
Weight: 0.097073501164592 Hub page
|
StaticTitleBM25Ex
web_production: 584
Weight: 0.016179974819787 BM25 of the page title against its text
|
StaticTitleLRBM25
web_production: 585
Weight: 0.038263040612831 BM25 of the page title against the texts of links to it
|
SeoInPayLinks
web_production: 586
Weight: -0.028595315195293 The number of paid SEO links between hosts
|
TitleInLinksTrigrams
web_production: 597
Weight: -0.076334972364641 The share of unique title trigrams among the trigrams of links
|
LinksInTitleTrigrams
web_production: 598
Weight: 0.019301158836494 The share of unique trigrams of links among the title trigrams
|
TrashAdv
web_production: 599
The greasy of the page
|
UrlGeoAdms
web_production: 601
The URL document corresponds to the user (http://wiki.yandex-team.ru/jandekspoisk/kacheStvopoiska/geo/regnavquerispoisk/KacheStvopoiska/GEO/RENAVAVQURIES)
|
RegNavQuery
web_production: 603
Regional navigational request: in the user region there are one or more navigational results for it
|
SOMaxSumSourceRank
web_production: 605
Weight: 0.061675217167197 The sum of the maximum values of Sourcerank's for each incoming link, taking into account the uniqueness of the owner.
|
DBM35
web_production: 606
Weight: 0.046757967567051 BM25 over texts and links with special weights for the level of match (form, lemma, synonym)
|
TRLRQuorumFm
web_production: 607
Weight: -0.062810308974889 The weight of the request words found in the text in exact form
|
TRLRQuorumLemma
web_production: 608
Weight: -0.003021983245146 The weight of the request words found in the text up to lemma
|
TRLRQuorumSyn
web_production: 609
The weight of the request words found in the text
|
IsHum
web_production: 610
Weight: 0.003622338166697 Passes to the basic search, under the name ISHUM, the maximum weight of the objects of category Hum or Hum1 found in the request. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV))). ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#ishum more))
|
IsText
web_production: 611
Passes to the basic search, under the name ISTEXT, the maximum weight of the objects of category TEXT or Text1 found in the request. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV))). ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#istext more))
|
IsPicture
web_production: 612
Passes to the basic search, under the name Ispicture, the maximum weight of the objects of category Picture or Picture1 found in the request. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV))). ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#ispicture))
|
MaxOne
web_production: 613
Weight: -0.059871381556405 Returns the maximum degree of household objects in the request under the name Wmaxone. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV))). ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#maxone more))
|
MinOne
web_production: 614
Weight: 0.113671587879567 Returns the maximum degree of household objects in the request under the name Wminone. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV))). ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#minone more))
|
DomPhraseClickRankBi
web_production: 627
Weight: 0.209866937086235 Click rank of the domain on bigrams (excluding thesaurus expansions of requests)
|
DomPhraseYabarBi
web_production: 628
Weight: 0.20518490511548 Transitions to the site from search engines on bigrams, according to the bar (excluding thesaurus expansions of requests)
|
LastWordHostClicks
web_production: 629
Weight: 0.06275358178297 The click rate of the host for the last word of the request (excluding thesaurus expansions of requests)
|
RcSearchBaseUrlRationalSigmoidD1TM600AtReq
web_production: 635
URL feature computed at the request time from rapid clicks search counters with decay of 1 day
|
RcSearchBaseUrlContrastD30Odd0_9_X_D30T1AtReq
web_production: 638
URL feature computed at the request time from rapid clicks search counters with decay of 30 days
|
DmozQueryBestTheme
web_production: 640
Weight: -0.000807198317231 The most likely theme of the request, determined by ((http://wiki.yandex-team.ru/jandekspoisk/zarubezhnyjjinternet/dmozqueryClassifier1 the DmozTheme sorcerer rule)); only the most popular topics are taken into account (but more of them than in the DmozQueryThemes factor). The factor contains the probability that the request matches the theme, but each theme is assigned its own interval on the segment [0..1]
|
DmozQueryThemes
web_production: 641
The theme of the request, determined by ((http://wiki.yandex-team.ru/jandekspoisk/zarubezhnyjjinternet/dmozqueryClassifier1 the DmozTheme sorcerer rule)); only a few of the most popular topics are taken into account.
|
DiversityCategNeedPhoto
web_production: 642
0 or 1, depending on the presence in the request of the clearly expressed intent Need_photo from the variety
|
DiversityCategNeedMap
web_production: 643
0 or 1, depending on the presence in the request of the clearly expressed intent Need_map from the variety
|
LongQuerySyn
web_production: 644
Weight: 0.058415162135787 The factor is an analogue of LongQuery (the sum of the IDFs of the request words), but with 'correct' accounting of synonyms. Specifically, for each word the minimum IDF (i.e. that of the most frequent form) among the word and its synonyms is selected.
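
The synonym handling described above can be sketched as follows, assuming IDF values and synonym lists are available as plain dictionaries. The function name, the data layout and the treatment of words missing from the IDF table are assumptions for illustration only.

```python
def long_query_syn(query_words: list[str],
                   synonyms: dict[str, list[str]],
                   idf: dict[str, float]) -> float:
    """Sum over request words of the minimum IDF among a word and its synonyms.

    Assumption: words absent from the IDF table contribute nothing.
    """
    total = 0.0
    for word in query_words:
        candidates = [word] + synonyms.get(word, [])
        known = [idf[c] for c in candidates if c in idf]
        if known:
            # The lowest IDF belongs to the most frequent variant.
            total += min(known)
    return total
```

For example, with `idf = {"car": 2.0, "auto": 1.5, "red": 3.0}` and `synonyms = {"car": ["auto"]}`, the request `["red", "car"]` sums 3.0 for "red" and 1.5 for the more frequent synonym "auto".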
|
TurkeyPageRank
web_production: 646
Personalized Turkish Pagerank
|
ExpectedFound
web_production: 647
Expected number of documents found for the request
|
FooterInLinksTrigrams
web_production: 648
The share of unique trigrams of a footer fragment in trigrams of links
|
LinksInFooterTrigrams
web_production: 649
The share of unique trigrams of links among the trigrams of a footer fragment
|
ErratumLogQueryProbability
web_production: 650
Double logarithm of the probability of the request under the language model of the Erratum typo service
|
DBM40
web_production: 652
Variation on the topic ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaformula/tekushhiekomponenty/DBM25 dBM25)), see.
|
BM25_0
web_production: 654
Variation on the topic BM25
|
BM25_1
web_production: 655
Variation on the topic BM25
|
BM25_0123
web_production: 656
Variation on the topic BM25
|
DBMNumbers
web_production: 662
DBM separately by numbers
|
DBMGeo
web_production: 663
DBM separately by geo-objects of request
|
DBMSubstantive
web_production: 664
DBM separately on the noun
|
Bocm
web_production: 668
Evaluates the correspondence of the positions of words in the sentences of the document to the positions of words in the request.
|
IsIndexPage
web_production: 671
This is Index.(HTML/PHP/ASPX?/...), without CGI parameters. Computed for all duplicates.
|
IsIndexPageSoft
web_production: 672
This is Index.(HTML/PHP/ASPX?/...), possibly with CGI parameters. Computed for all duplicates.
|
IsOwner
web_production: 673
Whether the host is the owner, conditionally host == Owner (Host).
|
MinPathLen
web_production: 674
The minimum length of PathAndQuery over all near-duplicates.
|
XLerfGeoLRlogRelevCnt
web_production: 675
Regionalized (only links from the country of the request are taken) variant of the XLerfGeoLRlogRelev factor
|
XNonCommLerfNormLRlogRelevCnt
web_production: 676
Regionalized (only links from the country of the request are taken) variant of the XNonCommLerfNormLRlogRelev factor
|
LocmCnt
web_production: 677
Regionalized (only links from the country of the request are taken) variant of the Locm factor
|
XLRrelevCnt
web_production: 678
Regionalized (only links from the country of the request are taken) variant of the XLRrelev factor
|
XLerfLRrelev200Cnt
web_production: 679
Regionalized (only links from the country of the request are taken) variant of the XLerfLRrelev200 factor
|
RankComGoodness
web_production: 681
Classifier for estimates of commercial sites
|
HasDownloadLinkOnFile
web_production: 682
The document has a direct link to the file
|
HasDownloadLinkOnFileHosting
web_production: 683
The document has a link to filehosting
|
DiversityCategDownload
web_production: 684
0 or 1 - whether the request is matured by the tickt
|
DiversityCategReview
web_production: 685
0 or 1 - whether the request is matured by the tickt
|
DiversityCategWatch
web_production: 686
0 or 1 - whether the request is matured by the tickt
|
QrTur
web_production: 687
The prediction of the share of “good” (at least two different cities and frequency> = 10) references to the request with geography in Turkey
|
QueryThEncyclopedic
web_production: 688
The result of the work of the lexical classifier of requests predicting the likelihood of click on the theme of 3561
|
QueryThVideohosting
web_production: 689
The result of the work of the lexical classifier of requests predicting the likelihood of click on the page 3973 page
|
IsNavMxQuery
web_production: 690
Rank 'navigation'
|
ClickedWithAnotherSEClicks
web_production: 692
Clicks on URLs shown in the results for requests after which users went on to search in other search engines
|
ShowsWithAnotherSEClicks
web_production: 693
Shows of URLs in the results for requests after which users went on to search in other search engines
|
CommercialOwnerRank_Reg
web_production: 694
Classifier of the commerciality of the site
|
BclmMax
web_production: 696
The proximity of the words of the request to the word with the greatest weight.
|
HasUserReviews
web_production: 698
The document contains user review/comment
|
DBM15Wares
web_production: 703
|
RankComGoodnessBar
web_production: 704
Classifier that approximates the quality of commercial sites based on user behavior data
|
DocCreateMonth
web_production: 705
The creation time of the document, to the nearest month: 1.0 is the current month, 0 is 10 years ago or older. Temporarily disabled
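
One way to read this scale is as a linear mapping from document age in months onto [0, 1]. The description gives only the two endpoints, so the linear interpolation, the function name and the `horizon_months` parameter below are assumptions.

```python
from datetime import date


def month_recency(created: date, today: date, horizon_months: int = 120) -> float:
    """Map document age in whole months to [0, 1].

    1.0 means the current month, 0.0 means `horizon_months` (10 years) ago or
    older. Assumption: values in between are interpolated linearly.
    """
    age_months = (today.year - created.year) * 12 + (today.month - created.month)
    age_months = max(0, min(age_months, horizon_months))  # clamp to the scale
    return 1.0 - age_months / horizon_months
```

Under this reading, a document created five years before the current month would map to 0.5.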
|
DocUpdateMonth
web_production: 706
The update time of the document, to the nearest month: 1.0 is the current month, 0 is 10 years ago or older. Temporarily disabled
|
XLRSourceRank
web_production: 707
|
XLRMainPage
web_production: 708
|
DaterStatsYearNormLikelihood
web_production: 709
The likelihood function of the distribution of years in the document. Temporarily disabled
|
LcmVar
web_production: 711
Variance of the number of words in the links.
|
DaterStatsAverageSourceSegment
web_production: 712
The arithmetic mean position of dates in the document. Temporarily disabled
|
DBM15Wares2
web_production: 713
|
Cabm
web_production: 714
BM with attenuation in the text of catalog links.
|
BeastNqUrlMeanPos
web_production: 715
The average position of the URL for a normalized request
|
BeastNqOwnerMeanPos
web_production: 716
The average position of Domattr for a normalized request
|
BeastUrlMeanPos
web_production: 717
The average position of the URL for all requests
|
BeastHostMeanPos
web_production: 718
The average position of the host for all requests
|
BeastUrlNumQueries
web_production: 719
Number of requests for URL
|
BeastHostNumQueries
web_production: 720
Number of requests for host
|
SegmentWordPortionFromMainContent
web_production: 723
The share of the words of the document from the segments with Score> 2.
|
UrlDomainSimilarityFixed
web_production: 724
|
TotalDups
web_production: 725
|
RankBoostGoodness
web_production: 726
The rank of site quality used for boosts of the Moscow commercial formula
|
LanguageDistribution
web_production: 729
|
UrlShowsWithNextPageClicksP1
web_production: 730
|
UrlShowsWithNextPageClicksP10
web_production: 731
The factor is used in Selectionrank. TG_UNUSED: should not be included in the formulas to avoid feedback
|
SmallWindowAttenuation
web_production: 734
|
RcSearchBaseUrlRationalSigmoidD3T120AtReq
web_production: 735
URL feature computed at the request time from rapid clicks search counters with decay of 3 days
|
CommRus
web_production: 737
The weight of the document according to a single-word dictionary of commercial vocabulary
|
WikiLinkCount
web_production: 738
|
UkrIsQueryLang
web_production: 741
Indicates that the request is in Ukrainian
|
QueriesAvgCM2
web_production: 742
Average query commerciality
|
RcSearchBaseUrlRationalSigmoidD1TM600Frozen
web_production: 746
URL feature computed from rapid clicks search frozen counters with decay of 1 day
|
NastyContent
web_production: 755
Content ugliness factor.
|
UrlQueryTrigramsStatic
web_production: 758
Static trigram intersection of the URL and the queries by which users visited the URL.
|
AdvAspam
web_production: 759
|
HasPornoQuery
web_production: 760
The result of the work of Adult Rules for the Sorcerer.
|
RcSearchBaseUrlContrastD30Odd0_9_X_D30T1Frozen
web_production: 768
URL feature computed from rapid clicks search frozen counters with decay of 30 days
|
AuxTitleBM25
web_production: 770
TEXTBM25 computed over the title against the name of the user region, similar to factor 268.
|
PopularSEFRCBrowser
web_production: 773
FRC Popular Search System for Browser Logs
|
RcSearchBaseUrlRationalSigmoidD3T120Frozen
web_production: 779
URL feature computed from rapid clicks search frozen counters with decay of 3 days
|
GeoRelevRegionCityGeoa
web_production: 782
GeoRelevRegionCity factor by the Geoa attribute
|
GeoRelevRegionRegionGeoa
web_production: 783
GeoRelevRegionRegion factor by the Geoa attribute
|
GeoGeometryProximGeoa
web_production: 784
GeoGeometryProxim factor by the Geoa attribute
|
GeoRelevAlienCityGeoa
web_production: 785
GeoRelevAlienCity factor by the Geoa attribute
|
GeoVQueryInUserCityGeoa
web_production: 786
GeoVQueryInUserCity factor by the Geoa attribute
|
GeoVQueryInAlienCityGeoa
web_production: 787
GeoVQueryInAlienCity factor by the Geoa attribute
|
PageRegionSizeGeo
web_production: 788
PageRegionSize factor by the GEO attribute
|
PageRegionCoverageGeo
web_production: 789
PageRegionCoverage factor by the GEO attribute
|
PageRegionCoverageAdresa
web_production: 790
PageRegionCoverage factor by the Adresa attribute
|
GeoRelevRegionCityAdresa
web_production: 791
GeoRelevRegionCity factor by the Adresa attribute
|
RcSpylogHostRationalSigmoidD3T0AtReq
web_production: 800
Host feature computed at the request time from rapid clicks spy_log counters with decay of 3 days
|
RcSpylogHostRationalSigmoidD3DTM3600AtReq
web_production: 801
Host feature computed at the request time from rapid clicks spy_log counters with decay of 3 days
|
RcSpylogHostRationalSigmoidD14T0AtReq
web_production: 802
Host feature computed at the request time from rapid clicks spy_log counters with decay of 14 days
|
RcSpylogHostRationalSigmoidD14DTM3600AtReq
web_production: 803
Host feature computed at the request time from rapid clicks spy_log counters with decay of 14 days
|
RcSpylogHostRationalSigmoidedCTRD3DT0TM3600AtReq
web_production: 804
Host feature computed at the request time from rapid clicks spy_log counters with decay of 3 days
|
RcSpylogHostRationalSigmoidedCTRD14DT0TM3600AtReq
web_production: 805
Host feature computed at the request time from rapid clicks spy_log counters with decay of 14 days
|
RcSpylogHostRationalSigmoidD3T0Frozen
web_production: 806
Host feature computed from rapid clicks spy_log counters with decay of 3 days
|
RcSpylogHostRationalSigmoidD3DTM3600Frozen
web_production: 807
Host feature computed from rapid clicks spy_log counters with decay of 3 days
|
RcSpylogHostRationalSigmoidD14T0Frozen
web_production: 808
Host feature computed from rapid clicks spy_log counters with decay of 14 days
|
RcSpylogHostRationalSigmoidD14DTM3600Frozen
web_production: 809
Host feature computed from rapid clicks spy_log counters with decay of 14 days
|
RcSpylogHostRationalSigmoidedCTRD3DT0TM3600Frozen
web_production: 810
Host feature computed from rapid clicks spy_log counters with decay of 3 days
|
RcSpylogHostRationalSigmoidedCTRD14DT0TM3600Frozen
web_production: 811
Host feature computed from rapid clicks spy_log counters with decay of 14 days
|
FioFromOriginalRequestBodyChain0Wcm
web_production: 820
Factor for the full name (FIO) from the original request, computed over the document body. Algorithm: Chain0Wcm.
|
FioFromOriginalRequestBodyMinWindowSize
web_production: 873
Factor for the full name (FIO) from the original request, computed over the document body. The minimum size of a window that contains all the words of the request, normalized by the number of words in the request.
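A minimal sketch of the minimum-window idea described above, not the production implementation. The function names are hypothetical, and the normalization shown (number of distinct request words divided by the window size) is an assumption: the source only says the value is normalized by the number of words in the request.

```python
def min_window_size(doc_words, query_words):
    """Length of the shortest run of consecutive document words that
    contains every distinct query word, or None if no such run exists."""
    required = set(query_words)
    if not required:
        return None
    counts = {}      # occurrences of each required word inside the window
    covered = 0      # how many distinct required words the window holds
    best = None
    left = 0
    for right, word in enumerate(doc_words):
        if word in required:
            counts[word] = counts.get(word, 0) + 1
            if counts[word] == 1:
                covered += 1
        # Shrink from the left while the window still covers everything.
        while covered == len(required):
            size = right - left + 1
            if best is None or size < best:
                best = size
            left_word = doc_words[left]
            if left_word in required:
                counts[left_word] -= 1
                if counts[left_word] == 0:
                    covered -= 1
            left += 1
    return best


def normalized_min_window(doc_words, query_words):
    """Assumed normalization: distinct query words / window size.
    1.0 means the words are adjacent; 0.0 means no covering window."""
    window = min_window_size(doc_words, query_words)
    if window is None:
        return 0.0
    return len(set(query_words)) / window
```

For example, with the body `ivan a b petrov ivan petrov` and the request `ivan petrov`, the shortest covering window is the final two words.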
|
FioFromOriginalRequestTextCosineMatchMaxPrediction
web_production: 874
Factor for the full name (FIO) from the original request, computed over the document text. Algorithm: CosineMatchMaxPrediction.
|
AllFioFromOriginalRequestAllMaxFBodyChain0Wcm
web_production: 875
Factor for all full names (FIO) from the original request. Aggregation over all expansions; aggregation type: the maximum value of the factor. Computed over the document body. Algorithm: Chain0Wcm.
|
AllFioFromOriginalRequestAllMaxFBodyMinWindowSize
web_production: 876
Factor for all full names (FIO) from the original request. Aggregation over all expansions; aggregation type: the maximum value of the factor. Computed over the document body. The minimum size of a window that contains all the words of the request, normalized by the number of words in the request.
|
AllFioFromOriginalRequestAllMaxFTextCosineMatchMaxPrediction
web_production: 882
Factor for all full names (FIO) from the original request. Aggregation over all expansions; aggregation type: the maximum value of the factor. Computed over the document text. Algorithm: CosineMatchMaxPrediction.
|
AliceClickDssm
web_production: 900
DSSM click-proximity score based on data specific to Alice
|
TelFullAttributeTextBocm15K001
web_production: 901
Factor for the Tel_Full telephone attribute from the original request, computed over the document text. Word-weight aggregation algorithm: Bocm15. Normalization coefficient 0.01.
|
AliceTimespentSuffixSum
web_production: 957
Prediction of the total time spent until the end of the session, given that this request-document pair is realized
|
AliceTimespent
web_production: 958
Prediction of the contribution of this request-document pair to time spent
|
AliceMaxPercentPlayed
web_production: 965
Prediction of the percentage of the track length that will be played, given that this request-document pair is realized
|
AliceTimespentSum
web_production: 1273
Prediction of the session time, given that this request-document pair is realized
|
OriginalRequestUrlCosineMatchMaxPrediction
web_production: 1279
Factor for the original request. Computed over the tokenized URL. Algorithm: CosineMatchMaxPrediction.
|
OriginalRequestUrlAttenV1Bm15K05
web_production: 1280
Factor for the original request. Computed over the tokenized URL. The weight of each hit is multiplied by 1/(1 + the position of the word in the sentence). Word-weight aggregation algorithm: BM15. Normalization coefficient 0.5.
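A rough sketch of position attenuation followed by a BM15-style aggregation. Only the 1/(1 + position) multiplier and the coefficient 0.5 come from the description above. The saturation form tf/(tf + k), the IDF-weighted average, and all function names are assumptions made for illustration.

```python
def attenuated_tf(hit_positions):
    """Sum of hit weights, each multiplied by 1 / (1 + position in sentence)."""
    return sum(1.0 / (1.0 + pos) for pos in hit_positions)


def bm15_score(query_idf, hits_by_word, k=0.5):
    """Assumed BM15-style aggregation: saturate each word's attenuated
    term frequency as tf / (tf + k), then take the IDF-weighted average."""
    total_idf = sum(query_idf.values())
    if total_idf == 0:
        return 0.0
    score = 0.0
    for word, idf in query_idf.items():
        tf = attenuated_tf(hits_by_word.get(word, []))
        score += idf * tf / (tf + k)
    return score / total_idf
```

Under these assumptions, a word hit at positions 0 and 1 gets an attenuated frequency of 1 + 1/2 = 1.5, so hits near the start of a sentence dominate the result.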
|
OriginalRequestTitleBclmMixPlainKE5
web_production: 1281
Factor for the original request. Computed over the document title. Word-weight aggregation algorithm: BclmMixPlain, a linear mixture of annotation Bclm weights and weighted positionless word weights, after which the resulting counters are aggregated via BM15. Normalization coefficient 10^(-5).
|
OriginalRequestTitleCMMatchTop5AvgMatchValue
web_production: 1282
Factor for the original request. Computed over the document title. Algorithm: CMMatchTop5AvgMatchValue.
|
OriginalRequestTitleWordCoverageForm
web_production: 1283
Factor for the original request. Computed over the document title. Degree of coverage of the request words, matched up to word form (without synonyms).
|
OriginalRequestTitleAttenV1Bm15K05
web_production: 1284
Factor for the original request. Computed over the document title. The weight of each hit is multiplied by 1/(1 + the position of the word in the sentence). Word-weight aggregation algorithm: BM15. Normalization coefficient 0.5.
|
OriginalRequestBodyBclmMixPlainKE5
web_production: 1285
Factor for the original request. Computed over the document body. Word-weight aggregation algorithm: BclmMixPlain, a linear mixture of annotation Bclm weights and weighted positionless word weights, after which the resulting counters are aggregated via BM15. Normalization coefficient 10^(-5).
|
OriginalRequestBodyCosineMatchMaxPrediction
web_production: 1286
Factor for the original request. Computed over the document body. Algorithm: CosineMatchMaxPrediction.
|
OriginalRequestBodyAllWcmWeightedPrediction
web_production: 1287
Factor for the original request. Computed over the document body. Algorithm: AllWcmWeightedPrediction.
|
OriginalRequestBodyBocm15K001
web_production: 1288
Factor for the original request. Computed over the document body. Word-weight aggregation algorithm: Bocm15. Normalization coefficient 0.01.
|
OriginalRequestBodyQueryPartMatchSumValueAny
web_production: 1289
Factor for the original request. Computed over the document body. Algorithm: QueryPartMatchSumValueAny.
|
OriginalRequestBodyWordCoverageForm
web_production: 1290
Factor for the original request. Computed over the document body. Degree of coverage of the request words, matched up to word form (without synonyms).
|
OriginalRequestBodyWordCoverageExact
web_production: 1291
Factor for the original request. Computed over the document body. Degree of coverage of the request words in their exact form.
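The difference between the two coverage variants above can be illustrated with a small sketch. It assumes coverage is the unweighted share of request words found in the document, which the source does not confirm. The `normalize` parameter and the toy lemma table are hypothetical stand-ins for real morphology.

```python
def word_coverage(query_words, doc_words, normalize=lambda w: w):
    """Share of query words that occur in the document after applying
    `normalize` to both sides (identity = exact-form matching)."""
    query = [normalize(w) for w in query_words]
    if not query:
        return 0.0
    doc = {normalize(w) for w in doc_words}
    return sum(1 for w in query if w in doc) / len(query)


# Toy lemma table standing in for a real lemmatizer (an assumption).
LEMMAS = {"cars": "car"}


def to_lemma(word):
    return LEMMAS.get(word, word)
```

With the request `red cars` and the text `a red car`, exact-form coverage counts only `red`, while matching up to word form also counts `cars` against `car`.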
|
OriginalRequestBodyBm15MaxAnnotationK001
web_production: 1292
Factor for the original request. Computed over the document body. Word-weight aggregation algorithm: Bm15MaxAnnotation. Normalization coefficient 0.01.
|
AllMatchedWordWeightsSum
web_production: 1407
The normalized sum of the weights of the request words that occur in the text of the document or in links to it.
|
StringMatchedWordWeightsSum
web_production: 1408
The normalized sum of the weights of the request words that match by string (equal_by_string) in the text of the document or in links to it.
|
AllMatchedWordWeightsSumText
web_production: 1409
The normalized sum of the weights of the request words that occur in the text of the document.
|
AllMatchedWordWeightsSumLink
web_production: 1410
The normalized sum of the weights of the request words that occur in links to the document.
|
StringMatchedWordWeightsSumLink
web_production: 1411
The normalized sum of the weights of the request words that match by string (equal_by_string) in links to the document.
|
AllMatchedWordFiltrationModelWeightsSum
web_production: 1412
The normalized sum of the FiltrationModel weights of the request words that occur in the text of the document or in links to it.
|
StringMatchedWordFiltrationModelWeightsSum
web_production: 1413
The normalized sum of the FiltrationModel weights of the request words that match by string (equal_by_string) in the text of the document or in links to it.
|
LemmaMatchedWordFiltrationModelWeightsSum
web_production: 1414
The normalized sum of the FiltrationModel weights of the request words that match by lemma (equal_by_lemma) in the text of the document or in links to it.
|
AllMatchedWordFiltrationModelWeightsSumLink
web_production: 1415
The normalized sum of the FiltrationModel weights of the request words that occur in links to the document.
|
StringMatchedWordFiltrationModelWeightsSumLink
web_production: 1416
The normalized sum of the FiltrationModel weights of the request words that match by string (equal_by_string) in links to the document.
|
NoApproxSmallWindowAttenuation
web_production: 1470
|
DssmBoostingXfWeightQuerySelfSimilarity
web_production: 1477
Dssm Boosting query self similarity for XfWeight model.
|
DssmBoostingXfWeightKMeans5AvgTop02Score
web_production: 1478
Dssm Boosting AvgTop02Score aggregation for XfWeight model over 5-means centroids.
|
DssmBoostingXfWeightKMeans5AvgTop04Score
web_production: 1479
Dssm Boosting AvgTop04Score aggregation for XfWeight model over 5-means centroids.
|
DssmBoostingXfWeightKMeans5AvgTop02ScoreAvgClusterTop3Weighted
web_production: 1480
Dssm Boosting AvgTop02ScoreAvgClusterTop3Weighted aggregation for XfWeight model over 5-means centroids.
|
DssmBoostingXfWeightKMeans5AvgTop02ScoreQE
web_production: 1481
Dssm Boosting AvgTop02Score aggregation for XfWeight model over 5-means centroids (query as expansion).
|
DssmBoostingXfWeightKMeans5AvgTop02ScoreAvgClusterTop3WeightedQE
web_production: 1482
Dssm Boosting AvgTop02ScoreAvgClusterTop3Weighted aggregation for XfWeight model over 5-means centroids (query as expansion).
|
DssmBoostingXfOneQuerySelfSimilarity
web_production: 1483
Dssm Boosting query self similarity for XfOne model.
|
DssmBoostingXfOneKMeans1Score
web_production: 1484
Dssm Boosting Score aggregation for XfOne model over 1-means centroids.
|
DssmBoostingXfOneKMeans1ScaledSumWeight
web_production: 1485
Dssm Boosting ScaledSumWeight aggregation for XfOne model over 1-means centroids.
|
DssmBoostingXfOneKMeans1ScoreQE
web_production: 1486
Dssm Boosting Score aggregation for XfOne model over 1-means centroids (query as expansion).
|
DssmBoostingXfOneKMeans1ScoreAvgNearest1WeightedQE
web_production: 1487
Dssm Boosting ScoreAvgNearest1Weighted aggregation for XfOne model over 1-means centroids (query as expansion).
|
DssmBoostingXfOneKMeans1ScoreAvgNearest5WeightedQE
web_production: 1488
Dssm Boosting ScoreAvgNearest5Weighted aggregation for XfOne model over 1-means centroids (query as expansion).
|
DssmBoostingXfOneSeKMeans1Score
web_production: 1489
Dssm Boosting Score aggregation for XfOneSe model over 1-means centroids.
|
DssmBoostingXfOneSeKMeans1ScoreScaledSumWeighted
web_production: 1490
Dssm Boosting ScoreScaledSumWeighted aggregation for XfOneSe model over 1-means centroids.
|
DssmBoostingXfOneSeKMeans1ScoreAvgNearest5Weighted
web_production: 1491
Dssm Boosting ScoreAvgNearest5Weighted aggregation for XfOneSe model over 1-means centroids.
|
DssmBoostingCtrQuerySelfSimilarity
web_production: 1492
Dssm Boosting query self similarity for Ctr model.
|
DssmBoostingCtrKMeans1Score
web_production: 1493
Dssm Boosting Score aggregation for Ctr model over 1-means centroids.
|
DssmBoostingCtrKMeans1ScoreQE
web_production: 1494
Dssm Boosting Score aggregation for Ctr model over 1-means centroids (query as expansion).
|
DssmBoostingCtrKMeans1ScoreScaledSumWeightedQE
web_production: 1495
Dssm Boosting ScoreScaledSumWeighted aggregation for Ctr model over 1-means centroids (query as expansion).
|
DssmBoostingCtrKMeans1ScoreAvgNearest1WeightedQE
web_production: 1496
Dssm Boosting ScoreAvgNearest1Weighted aggregation for Ctr model over 1-means centroids (query as expansion).
|
RandomLogHostHasPaymentsAvg
web_production: 1524
AVG aggregation of HasPayments web factor using random log
|
RandomLogHostIsVideoQueryAvg
web_production: 1525
AVG aggregation of VideoQuery web factor using random log
|
RandomLogHostSyntQualityAvg
web_production: 1526
AVG aggregation of SyntQuality web factor using random log
|
RandomLogHostGeoRegionalityVNewPerc90
web_production: 1527
PERCENTILE_90 aggregation of GeoRegionalityVNew web factor using random log
|
RandomLogHostQClassDownloadAvg
web_production: 1528
AVG aggregation of QClassDownload web factor using random log
|
RandomLogHostIsMusicAvg
web_production: 1529
AVG aggregation of IsMusic web factor using random log
|
RandomLogHostQueryThEncyclopedicPerc25
web_production: 1530
PERCENTILE_25 aggregation of QueryThEncyclopedic web factor using random log
|
RandomLogHostCommercialOwnerRankRegAvg
web_production: 1531
AVG aggregation of CommercialOwnerRank_Reg web factor using random log
|
RandomLogHostYabarWordDNGIPerc25
web_production: 1532
PERCENTILE_25 aggregation of YabarWordDepthNodesGradientMin web factor using random log
|
RandomLogHostPopularSEFRCBrowserAvg
web_production: 1533
AVG aggregation of PopularSEFRCBrowser web factor using random log
|
RandomLogHostURLClicksMaxGeoRegionFRCRatioAvg
web_production: 1534
AVG aggregation of URLClicksMaxGeoRegionFRCRatio web factor using random log
|
RandomLogHostUBLongPeriodDirectHChildren90CntPerc90
web_production: 1535
PERCENTILE_90 aggregation of UBLongPeriodDirectHChildren90CntFromExtHost web factor using random log
|
RandomLogHostUBLongPeriodDtUrlHChildrenPerc90
web_production: 1536
PERCENTILE_90 aggregation of UBLongPeriodDtUrlHChildrenCut600Reg web factor using random log
|
RandomLogHostIsPictureAvg
web_production: 1537
AVG aggregation of IsPicture web factor using random log
|
RandomLogHostErratumLogQueryProbabilityAvg
web_production: 1538
AVG aggregation of ErratumLogQueryProbability web factor using random log
|
DssmQueryCountryToUrlEstimatedDistance
web_production: 1542
Click length from this country, predicted from the request and country using a DSSM model.
|
AliceMusicUrlTypeIsTrack
web_production: 1559
The type of the canonicalized Yandex Music URL is track
|
IsHttps
web_production: 1764
The document uses the HTTPS protocol
|
QueryDoppMedianDwelltime
web_production: 1797
Median dwell time of the request in history. Dwell time is capped at 6000. The request is normalized by doppelgangers
|
QueryDoppMultipleClicksShows
web_production: 1798
The number of shows of the request with more than one click in history. The request is normalized by doppelgangers
|
QueryDoppMultipleClicksProbability
web_production: 1799
The share of shows with more than one click from all shows in history. The request is normalized by doppelgangers
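The two multiple-click factors above amount to simple counting over the request's show history. A minimal sketch, assuming the history is available as a list of click counts, one per show (a hypothetical input format); doppelganger normalization of the request is left out.

```python
def multiple_click_stats(clicks_per_show):
    """Return (number of shows with more than one click,
    share of such shows among all shows)."""
    shows = len(clicks_per_show)
    multi = sum(1 for clicks in clicks_per_show if clicks > 1)
    share = multi / shows if shows else 0.0
    return multi, share
```

For a history of five shows with click counts `[0, 1, 2, 3, 1]`, two shows have more than one click, so the share is 2/5.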
|
QueryDoppTimeFromPreviousPercentile25
web_production: 1844
25th percentile of the time from the previous request to the current one. The request is normalized by doppelgangers
|
NeuroTextModelLongClickPredictorByWordAndBigramCountersWithSSHards
web_production: 1845
The result of applying a neural model trained to distinguish long clicks from other events; the model input is word-level and bigram counters computed over the text streams (Title, Body, URL).
|
QfufFilteredByXfOneSeAllMaxFFieldSet2Bm15FLogK0001
web_production: 1847
Linguistic boosting factor. Expansion type: QfufFilteredByXfOneSe (Qfuf filtered by the XfOneSe DSSM model). Aggregation over all expansions: the maximum value of the factor. Computed over the union of the streams Url, Title, Body, CorrectedCtr, LongClick, OneClick, BrowserPageRank, SplitDwellTime, SamplePeriodDayFrc, SimpleClick, YabarVisits, YabarTime. Word-weight aggregation algorithm: Bm15FLog (BM15 aggregation of the logarithm of the word values). Normalization coefficient 0.001.
|
QfufFilteredByXfOneSeAllMaxFFieldSet3BclmWeightedFLogW0K0001
web_production: 1848
Linguistic boosting factor. Expansion type: QfufFilteredByXfOneSe (Qfuf filtered by the XfOneSe DSSM model). Aggregation over all expansions: the maximum value of the factor. Computed over the union of the streams Title, Body, LongClick, LongClickSP, OneClick. Word-weight aggregation algorithm: BclmWeightedFLogW0. Normalization coefficient 0.001.
|
QfufFilteredByXfOneSeAllMaxFFieldSetUTBm15FLogW0K00001
web_production: 1849
Linguistic boosting factor. Expansion type: QfufFilteredByXfOneSe (Qfuf filtered by the XfOneSe DSSM model). Aggregation over all expansions: the maximum value of the factor. Computed over a composite stream consisting of the tokenized URL and the document title. Word-weight aggregation algorithm: Bm15FLogW0. Normalization coefficient 0.0001.
|
QfufFilteredByXfOneSeAllMaxFTitleBm15K01
web_production: 1850
Linguistic boosting factor. Expansion type: QfufFilteredByXfOneSe (Qfuf filtered by the XfOneSe DSSM model). Aggregation over all expansions: the maximum value of the factor. Computed over the document title. Word-weight aggregation algorithm: BM15. Normalization coefficient 0.1.
|
QfufFilteredByXfOneSeTopSumWFSumWFieldSet2Bm15FLogK0001
web_production: 1851
Linguistic boosting factor. Expansion type: QfufFilteredByXfOneSe (Qfuf filtered by the XfOneSe DSSM model). Aggregation over the top 10 expansions (by factor value): a weighted sum of the factor values, normalized by the total weight of the expansions. Computed over the union of the streams Url, Title, Body, CorrectedCtr, LongClick, OneClick, BrowserPageRank, SplitDwellTime, SamplePeriodDayFrc, SimpleClick, YabarVisits, YabarTime. Word-weight aggregation algorithm: Bm15FLog (BM15 aggregation of the logarithm of the word values). Normalization coefficient 0.001.
|
QfufFilteredByXfOneSeTopSumWFSumWBodyMinWindowSize
web_production: 1852
Linguistic boosting factor. Expansion type: QfufFilteredByXfOneSe (Qfuf filtered by the XfOneSe DSSM model). Aggregation over the top 10 expansions (by factor value): a weighted sum of the factor values, normalized by the total weight of the expansions. Computed over the document body. The minimum size of a window that contains all the words of the request, normalized by the number of words in the request.
|
OriginalRequestWordsFilteredByDssmSSHardFieldSet1Bm15FLogK0001
web_production: 1853
Factor for the filtered original request: the DSSM distance from the request without a word to the original request is computed, after which words are cut off by a threshold. Computed over the union of the streams Url, Title, Body, Links, CorrectedCtr, LongClick, OneClick, BrowserPageRank, SplitDwellTime, SamplePeriodDayFrc, SimpleClick, YabarVisits, YabarTime. Word-weight aggregation algorithm: Bm15FLog (BM15 aggregation of the logarithm of the word values). Normalization coefficient 0.001.
|
OriginalRequestWordsFilteredByDssmSSHardFieldSetUTBm15FLogW0K00001
web_production: 1854
Factor for the filtered original request: the DSSM distance from the request without a word to the original request is computed, after which words are cut off by a threshold. Computed over a composite stream consisting of the tokenized URL and the document title. Word-weight aggregation algorithm: Bm15FLogW0. Normalization coefficient 0.0001.
|
FractionOfPresentedInTitleWordsWithWeightsByDssmSSHardModel
web_production: 1857
For every word of the request, a weight is computed by the query-mutation method (the distance between the request with the word and the request without it). The sum of the weights of the words found in the title is divided by the sum of the weights of all words.
|
MaxWeightOfAbsentInTitleWordsWithWeightsByDssmSSHardModel
web_production: 1858
For every word of the request, a weight is computed by the query-mutation method (the distance between the request with the word and the request without it). The maximum weight among the words absent from the document title is taken.
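The two title factors above (the weighted fraction of present words and the maximum weight of an absent word) can be sketched as follows. The per-word weights are taken as given input here; in the source they come from a DSSM query-mutation distance, which this sketch does not model. Function names and the exact-string title match are assumptions.

```python
def fraction_present_in_title(word_weights, title_words):
    """Sum of weights of request words found in the title,
    divided by the sum of weights of all request words."""
    total = sum(word_weights.values())
    if total == 0:
        return 0.0
    title = set(title_words)
    present = sum(w for word, w in word_weights.items() if word in title)
    return present / total


def max_weight_absent_in_title(word_weights, title_words):
    """Largest weight among request words missing from the title
    (0.0 when every request word is present)."""
    title = set(title_words)
    absent = [w for word, w in word_weights.items() if word not in title]
    return max(absent) if absent else 0.0
```

A high fraction together with a low maximum absent weight suggests that only unimportant request words are missing from the title.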
|
NeuroTextModelLongClickPredictorByWordAndBigramCountersWithoutTitleWithSSHards
web_production: 1859
The result of applying a neural model trained to distinguish long clicks from other events; the model input is word-level and bigram counters computed over the text streams (Body, URL).
|
XfOneSeKnnAllMaxWFMaxWFieldSet1Bm15FLogK0001
web_production: 1864
Linguistic boosting factor. Expansion type: XfOneSeKnn (expansions nearest under the DSSM model trained to predict XfDtShow). Aggregation over all expansions: the maximum weighted value of the factor, normalized by the maximum expansion weight. Computed over the union of the streams Url, Title, Body, Links, CorrectedCtr, LongClick, OneClick, BrowserPageRank, SplitDwellTime, SamplePeriodDayFrc, SimpleClick, YabarVisits, YabarTime. Word-weight aggregation algorithm: Bm15FLog (BM15 aggregation of the logarithm of the word values). Normalization coefficient 0.001.
|
XfOneSeKnnAllMaxWFMaxWOneClickFullMatchValue
web_production: 1865
Linguistic boosting factor. Expansion type: XfOneSeKnn (expansions nearest under the DSSM model trained to predict XfDtShow). Aggregation over all expansions: the maximum weighted value of the factor, normalized by the maximum expansion weight. Todo algorithm: the maximum weight of an annotation that fully matches the request. Computed over the OneClick stream.
|
QueryToTextByXfOneSeKnnTopSumWFSumWBodyMinWindowSize
web_production: 1866
Linguistic boosting factor. Expansion type: QueryToTextByXfOneSeKnn (QueryToText expansions of XfOneSeKnn expansions). Aggregation over the top 10 expansions (by factor value): a weighted sum of the factor values, normalized by the total weight of the expansions. Computed over the document body. The minimum size of a window that contains all the words of the request, normalized by the number of words in the request.
|
QueryToTextByXfOneSeKnnAllSumWFSumWFieldSet3BclmWeightedFLogW0K0001
web_production: 1867
Linguistic boosting factor. Expansion type: QueryToTextByXfOneSeKnn (QueryToText expansions of XfOneSeKnn expansions). Aggregation over all expansions: a weighted sum of the factor values, normalized by the total weight of the expansions. Computed over the union of the streams Title, Body, LongClick, LongClickSP, OneClick. Word-weight aggregation algorithm: BclmWeightedFLogW0. Normalization coefficient 0.001.
|
RequestMultitokensAllMaxFUrlBclmMixPlainKE5
web_production: 1910
Features calculated on url with request multitokens expansion
|
RequestMultitokensAllSumW2FSumWUrlExactQueryMatchAvgValue
web_production: 1911
Features calculated on url with request multitokens expansion
|
DmozQueryThemes
begemot_query_factors: 1
The theme of the request, determined by ((http://wiki.yandex-team.ru/jandekspoisk/zarubezhnyjjinternet/dmozqueryClassifier1 the DmozTheme wizard rule)); only a few of the most popular themes are taken into account.
|
DmozQueryBestTheme
begemot_query_factors: 2
The most likely theme of the request, determined by ((http://wiki.yandex-team.ru/jandekspoisk/zarubezhnyjjinternet/dmozqueryClassifier1 the DmozTheme wizard rule)); only the most popular themes are taken into account (but more of them than in the DmozQueryThemes factor). The factor contains the probability that the request matches the theme, but each theme is assigned its own interval within the segment [0..1].
|
MusicQ
begemot_query_factors: 18
The musicality of the request. The result of Anton Konygin's wizard.
|
IsNavMxQuery
begemot_query_factors: 21
Rank 'navigation'
|
PornoQuery
begemot_query_factors: 23
Whether the request contains any words from Yweb/Pornofilter/Porno.query.
|
XPornoQuery
begemot_query_factors: 24
Classifier of porn requests, using a different dictionary than PornoQuery
|
QBlog
begemot_query_factors: 25
Whether the request contains blog vocabulary
|
IsForeignQuery
begemot_query_factors: 27
Request is not in Russian
|
VideoQuery
begemot_query_factors: 28
Request about the video
|
GeoRegionalityUNew
begemot_query_factors: 32
Request factors: the result of the ((http://wiki.yandex-team.ru/poiskovajaplatforma/lingvistika/zaprosnyjefefactory/localizovannyjezaprosya request geolocalization classifier)), a new version of factors [328]-[330]: U: regional results are meaningless for the request;
|
GeoRegionalityRNew
begemot_query_factors: 33
Request factors: the result of the ((http://wiki.yandex-team.ru/PoiskovajaPlatforma/Lingvistika/ZaprosnyjeFactory/LocalizovannyjeZaprosy request geolocalization classifier)), a new version of factors [328]-[330]: R (georelevant): regional results in the search output could be useful, but nothing more;
|
GeoRegionalityVNew
begemot_query_factors: 34
Request factors: the result of the ((http://wiki.yandex-team.ru/poiskovajaplatforma/lingvistika/zaprosnyjefefactory/localizovannyjezaprosy request geolocalization classifier)), a new version of factors [328]-[330]: V: regional results are of fundamental importance.
|
RegNavQuery
begemot_query_factors: 35
Regional navigational request: the user's region has one or more navigational results for it
|
QueryThVideohosting
begemot_query_factors: 36
The result of the lexical request classifier, which predicts the probability of a click on theme 3973
|
QueryThEncyclopedic
begemot_query_factors: 37
The result of the lexical request classifier, which predicts the probability of a click on theme 3561
|
QrTur
begemot_query_factors: 38
Prediction of the share of “good” (at least two different cities and frequency >= 10) references to the request with geography in Turkey
|
NearbyQuery
begemot_query_factors: 39
When answering the request, results in close proximity to the user are important ([pharmacies], [children's clinic])
|
CityQuery
begemot_query_factors: 40
When answering a request, the results within the city are important (the bulk of localized queries)
|
AdmQuery
begemot_query_factors: 41
When answering the request, results from the user's administrative region are important ([airport], [dairy])
|
IsGeo
begemot_query_factors: 42
Passes to the basic search, under the name ISGEO, the maximum weight of a geo-object found in the request. A geo-object is an object of category Geo, Geo1, Geoaddr, Geoaddr1, Landmark, Landmark1 (see ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects kaovsky allocation))). ((http://wiki.yandex-team.ru/arsengadzhikurbanov/wares Read more))
|
IsMusic
begemot_query_factors: 43
Passes to the basic search, under the name ISMUSIC, the maximum weight of an object of category Music or Music1 found in the request. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/warees more))
|
IsOrg
begemot_query_factors: 44
The request is the name of an organization (example: Gazprom) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/warees Description))
|
IsHum
begemot_query_factors: 45
Passes to the basic search, under the name ISHUM, the maximum weight of an object of category Hum or Hum1 found in the request. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#ishum more))
|
IsText
begemot_query_factors: 46
Passes to the basic search, under the name ISTEXT, the maximum weight of an object of category Text or Text1 found in the request. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#istext more))
|
IsPicture
begemot_query_factors: 47
Passes to the basic search, under the name Ispicture, the maximum weight of an object of category Picture or Picture1 found in the request. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#ispicture))
|
MinOne
begemot_query_factors: 48
Returns the maximum degree of household objects in the request under the name Wminone. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#minone more))
|
MaxOne
begemot_query_factors: 49
Returns the maximum degree of household objects in the request under the name Wmaxone. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#maxone more))
|
CountryQueryRegionality
begemot_query_factors: 50
Country-level localization classifier: how strongly the request implies a country context
|
QClassDownload
begemot_query_factors: 51
Equals 1 for requests of the Download class: download / watch online / play / photo / listen
|
QClassBrandnames
begemot_query_factors: 52
The result of the request classifier: the request contains words from the corresponding dictionary. Dictionary: brands
|
QClassDisease
begemot_query_factors: 53
Medication Dictionary
|
QClassKak
begemot_query_factors: 54
question
|
QClassMoscow
begemot_query_factors: 55
Specific request for Moscow
|
QClassOAO
begemot_query_factors: 56
organization
|
QClassPorno
begemot_query_factors: 57
porn
|
QClassTravel
begemot_query_factors: 58
trips
|
DiversityCategNeedPhoto
begemot_query_factors: 59
0 or 1, depending on whether the request carries the clearly expressed intent Need_photo from diversity
|
DiversityCategNeedMap
begemot_query_factors: 60
0 or 1, depending on whether the request carries the clearly expressed intent Need_map from diversity
|
DiversityCategDownload
begemot_query_factors: 61
0 or 1: whether the request is marked with the tag
|
DiversityCategReview
begemot_query_factors: 62
0 or 1: whether the request is marked with the tag
|
DiversityCategWatch
begemot_query_factors: 63
0 or 1: whether the request is marked with the tag
|
HasMisspell
begemot_query_factors: 64
There is a typo in the request
|
ErratumLogQueryProbability
begemot_query_factors: 65
Double logarithm of the probability of the request under the language model of the Erratum typo-correction service
|
HasPornoQuery
begemot_query_factors: 79
The result of the Adult rules of the wizard.
|
DssmQueryCountryToUrlEstimatedDistance
begemot_query_factors: 122
Click length from this country, predicted from the request and country using a DSSM model.
|
LongQuerySyn
begemot_query_factors: 152
An analogue of LongQuery (the sum of the IDFs of the request words), but with 'correct' handling of synonyms. Specifically, for each word the minimum IDF (i.e. that of the most frequent variant) among the word and its synonyms is used.
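A small sketch of the synonym-aware IDF sum described above. The IDF table, the synonym table, and the function names are hypothetical inputs invented for illustration; any further scaling the real factor may apply is not modeled.

```python
def long_query(query_words, idf):
    """Plain variant: sum of the IDFs of the request words."""
    return sum(idf.get(word, 0.0) for word in query_words)


def long_query_syn(query_words, idf, synonyms):
    """Synonym-aware variant: for each word use the minimum IDF
    among the word itself and its synonyms."""
    total = 0.0
    for word in query_words:
        variants = [word] + list(synonyms.get(word, []))
        total += min(idf.get(v, 0.0) for v in variants)
    return total
```

If a rare word has a frequent synonym, the synonym-aware sum is lower than the plain one, since the frequent variant determines how specific the word really is.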
|
QueryDoppMedianDwelltime
begemot_query_factors: 179
Median dwell time of the request in history. Dwell time is capped at 6000. The request is normalized by doppelgangers
|
QueryDoppMultipleClicksShows
begemot_query_factors: 180
The number of shows of the request with more than one click in history. The request is normalized by doppelgangers
|
QueryDoppMultipleClicksProbability
begemot_query_factors: 181
The share of shows with more than one click from all shows in history. The request is normalized by doppelgangers
|
QueryDoppTimeFromPreviousPercentile25
begemot_query_factors: 191
25th percentile of the time from the previous request to the current one. The request is normalized by doppelgangers
|
Top11WorstKernelClusters
begemot_query_factors: 214
The request falls into the top 11 worst clusters by the Kernel metric, based on DSSM proximity.
|
IsCtrDssmClusterNumber34
begemot_query_factors: 239
Requests got into the 34th cluster based on CTR-DSSM.
|
IsSeoQuery
begemot_query_factors: 293
Indicator - is the request of the SEO -request
|
IsSeoQueryList1
begemot_query_factors: 294
Indicator - is the request of the SEO -request from the list No. 1
|
IsSeoQueryList2
begemot_query_factors: 295
Indicator - is the request of the SEO request from the list No. 2
|
HasFioInQuery
begemot_query_factors: 297
Request contains full name from FIO Rules
|
IsAliceMusicQuery
begemot_query_factors: 299
Music query from Alice
|
IsArabicAliceMusicQuery
begemot_query_factors: 325
Music query for the Arabic Alice scenario
|
InvWordCount
begemot_query_factors: 326
1/(1 + number of words in the query); attributes are not counted
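As a small illustration of the formula above, assuming the caller has already stripped attributes from the word list (the function name is hypothetical):

```python
def inv_word_count(query_words):
    """1 / (1 + number of query words); attributes are assumed to be removed already."""
    return 1.0 / (1 + len(query_words))


print(inv_word_count(["buy", "phone", "moscow"]))  # 1 / (1 + 3)
```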
|
HasAttrs
begemot_query_factors: 327
Whether this is an attribute query (looks at rearr = attrs)
|
AbsolutePLM
collections_production: 3
|
Bclm
collections_production: 4
|
TxtBm25Sy
collections_production: 5
|
DocLen
collections_production: 6
|
Bclm2
collections_production: 7
|
Tocm
collections_production: 8
|
TitleTrigramsTitle
collections_production: 9
|
TextBM25_Fm_W1
collections_production: 10
|
TxtBm25Ex
collections_production: 11
|
TextBM25
collections_production: 12
|
TextBM25_Sy_W1
collections_production: 13
|
TxtHeadSy
collections_production: 14
|
YmwFull2
collections_production: 15
|
TxtHeadEx
collections_production: 16
|
TxtHead
collections_production: 17
|
TxtBreakSy
collections_production: 18
|
TxtBreakEx
collections_production: 19
|
QueryThEncyclopedic
images_l1: 84
Output of the lexical query classifier that predicts the probability of a click on a page with theme 3561
|
IsPicture
images_l1: 103
Passes to the basic search, under the name Ispicture, the maximum weight of an object of the Picture or Picture1 category found in the query. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#ispicture))
|
MaxOne
images_l1: 104
Returns the maximum degree of household objects in the query under the name Wmaxone. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#maxone more))
|
MinOne
images_l1: 105
Returns the maximum degree of household objects in the query under the name Wminone. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#minone more))
|
QClassDownload
images_l1: 112
= 1 - v. Download formula. Query classes: download/watch online/play/photo/listen
|
DmozQueryThemes
images_new_l1: 1
The topic of the query as determined by ((http://wiki.yandex-team.ru/jandekspoisk/zarubezhnyjjinternet/dmozqueryClassifier1 the sorcerer rule DmozTheme)); only a few of the most popular topics are taken into account.
|
DmozQueryBestTheme
images_new_l1: 2
The most likely topic of the query as determined by ((http://wiki.yandex-team.ru/jandekspoisk/zarubezhnyjjinternet/dmozqueryClassifier1 the sorcerer rule DmozTheme)); only the most popular topics are taken into account (though more of them than in the DmozQueryThemes factor). The factor holds the probability that the query matches the topic, with each topic assigned its own sub-interval of the segment [0..1]
|
MusicQ
images_new_l1: 18
How music-related the query is. Output of Anton Konygin's sorcerer.
|
IsNavMxQuery
images_new_l1: 21
Rank 'navigation'
|
PornoQuery
images_new_l1: 23
Whether the query contains any words from Yweb/Pornofilter/Porno.query.
|
XPornoQuery
images_new_l1: 24
Porn query classifier; uses a different dictionary than PornoQuery
|
QBlog
images_new_l1: 25
Whether the query contains blog vocabulary
|
IsForeignQuery
images_new_l1: 27
Request is not in Russian
|
VideoQuery
images_new_l1: 28
Request about the video
|
GeoRegionalityUNew
images_new_l1: 32
Query factors: the output of the ((http://wiki.yandex-team.ru/poiskovajaplatforma/lingvistika/zaprosnyjefefactory/localizovannyjezaprosya query geolocalization classifier)), a new version of factors [328]-[330]: U: localizing the query is meaningless;
|
GeoRegionalityRNew
images_new_l1: 33
Query factors: the output of the ((http://wiki.yandex-team.ru/PoiskovajaPlatforma/Lingvistika/ZaprosnyjeFactory/LocalizovannyjeZaprosy query geolocalization classifier)), a new version of factors [328]-[330]: R: geo-relevant; regional results in the search output could be useful, but nothing more;
|
GeoRegionalityVNew
images_new_l1: 34
Query factors: the output of the ((http://wiki.yandex-team.ru/poiskovajaplatforma/lingvistika/zaprosnyjefefactory/localizovannyjezaprosy query geolocalization classifier)), a new version of factors [328]-[330]: V: regionality is of fundamental importance.
|
RegNavQuery
images_new_l1: 35
Regional navigational query: in the user's region there are one or more navigational results for it
|
QueryThVideohosting
images_new_l1: 36
Output of the lexical query classifier that predicts the probability of a click on a page with theme 3973
|
QueryThEncyclopedic
images_new_l1: 37
Output of the lexical query classifier that predicts the probability of a click on a page with theme 3561
|
QrTur
images_new_l1: 38
Predicted share of "good" (at least two different cities and frequency >= 10) occurrences of the query with geography in Turkey
|
NearbyQuery
images_new_l1: 39
For this query, results in the immediate vicinity matter ([pharmacies], [children's clinic])
|
CityQuery
images_new_l1: 40
For this query, results within the city matter (the bulk of localized queries)
|
AdmQuery
images_new_l1: 41
For this query, results from the user's district or region matter ([airport], [dairy])
|
IsGeo
images_new_l1: 42
Passes to the basic search, under the name ISGEO, the maximum weight of a geo-object found in the query. A geo-object is an object of category GEO, Geo1, Geoaddr, Geoaddr1, Landmark, or Landmark1 (see ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects kaovsky allocation))). ((http://wiki.yandex-team.ru/arsengadzhikurbanov/wares Read more))
|
IsMusic
images_new_l1: 43
Passes to the basic search, under the name ISMUSIC, the maximum weight of an object of the Music or Music1 category found in the query. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/warees more))
|
IsOrg
images_new_l1: 44
The query is the name of an organization (example: Gazprom, Gazprom) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/warees Description))
|
IsHum
images_new_l1: 45
Passes to the basic search, under the name ISHUM, the maximum weight of an object of the Hum or Hum1 category found in the query. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#ishum more))
|
IsText
images_new_l1: 46
Passes to the basic search, under the name ISTEXT, the maximum weight of an object of the TEXT or Text1 category found in the query. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#istext more))
|
IsPicture
images_new_l1: 47
Passes to the basic search, under the name Ispicture, the maximum weight of an object of the Picture or Picture1 category found in the query. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#ispicture))
|
MinOne
images_new_l1: 48
Returns the maximum degree of household objects in the query under the name Wminone. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#minone more))
|
MaxOne
images_new_l1: 49
Returns the maximum degree of household objects in the query under the name Wmaxone. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#maxone more))
|
CountryQueryRegionality
images_new_l1: 50
Country-level localization classifier: how strongly the query implies a country context
|
QClassDownload
images_new_l1: 51
= 1 - v. Download formula. Query classes: download/watch online/play/photo/listen
|
QClassBrandnames
images_new_l1: 52
Query classifier output: the query contains words from the corresponding dictionary. Brand
|
QClassDisease
images_new_l1: 53
Medication Dictionary
|
QClassKak
images_new_l1: 54
question
|
QClassMoscow
images_new_l1: 55
Specific request for Moscow
|
QClassOAO
images_new_l1: 56
organization
|
QClassPorno
images_new_l1: 57
porn
|
QClassTravel
images_new_l1: 58
trips
|
DiversityCategNeedPhoto
images_new_l1: 59
0 or 1, depending on whether the query contains the clearly expressed intent Need_photo from the diversity categories
|
DiversityCategNeedMap
images_new_l1: 60
0 or 1, depending on whether the query contains the clearly expressed intent Need_map from the diversity categories
|
DiversityCategDownload
images_new_l1: 61
0 or 1: whether the query is marked with the tag
|
DiversityCategReview
images_new_l1: 62
0 or 1: whether the query is marked with the tag
|
DiversityCategWatch
images_new_l1: 63
0 or 1: whether the query is marked with the tag
|
HasMisspell
images_new_l1: 64
The query contains a typo
|
ErratumLogQueryProbability
images_new_l1: 65
Double logarithm of the query's probability under the language model of the Erratum typo service
|
HasPornoQuery
images_new_l1: 79
Output of the sorcerer's Adult rules.
|
LongQuerySyn
images_new_l1: 152
An analogue of LongQuery (the sum of the IDFs of the query words), but with 'correct' handling of synonyms. Specifically, the minimum IDF (i.e. that of the most frequent variant) among a word and its synonyms is used.
|
IsText
images_production: 191
Passes to the basic search, under the name ISTEXT, the maximum weight of an object of the TEXT or Text1 category found in the query. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#istext more))
|
QueryThEncyclopedic
images_production: 192
Output of the lexical query classifier that predicts the probability of a click on a page with theme 3561
|
QueryThVideohosting
images_production: 193
Output of the lexical query classifier that predicts the probability of a click on a page with theme 3973
|
IsPicture
images_production: 194
Passes to the basic search, under the name Ispicture, the maximum weight of an object of the Picture or Picture1 category found in the query. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#ispicture))
|
QClassPorno
images_production: 195
porn
|
IsNavMxQuery
images_production: 196
Rank 'navigation'
|
MaxOne
images_production: 199
Returns the maximum degree of household objects in the query under the name Wmaxone. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#maxone more))
|
MinOne
images_production: 200
Returns the maximum degree of household objects in the query under the name Wminone. (See ((http://wiki.yandex-team.ru/alekseysokirko/queryobjects SOM-OV)).) ((http://wiki.yandex-team.ru/arsengadzhikurbanov/Wares#minone more))
|
QClassDownload
images_production: 345
= 1 - v. Download formula. Query classes: download/watch online/play/photo/listen
|
KinopoiskSuggestAllMaxWFMaxWTitleExactQueryMatchAvgValue
kp_text_machine: 0
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over all expansions. Aggregation type: the largest weighted value of the factor, normalized by the maximum expansion weight. Computed over the document title. The average weight of the annotations among those in which the query was an exact match.
|
KinopoiskSuggestTopMinWFMaxWTitleBclmMixPlainKE5
kp_text_machine: 1
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over the top 10 expansions (by factor value). Aggregation type: the smallest weighted value of the factor, normalized by the maximum expansion weight. Computed over the document title. Word-weight aggregation algorithm: BclmMixPlain, a linear mixture of the annotation BCLM weights and the weighted Positionless word weights; the resulting values are then aggregated via BM15. Normalization coefficient 10^(-5).
|
KinopoiskSuggestTopSumW2FSumWTitleExactQueryMatchAvgValue
kp_text_machine: 2
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over the top 10 expansions (by factor value). Aggregation type: the sum of squared expansion weights multiplied by the factor value, normalized by the total weight of the expansions. Computed over the document title. The average weight of the annotations among those in which the query was an exact match.
|
KinopoiskSuggestAllMaxWFTitleExactQueryMatchAvgValue
kp_text_machine: 3
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over all expansions. Aggregation type: the largest weighted value of the factor. Computed over the document title. The average weight of the annotations among those in which the query was an exact match.
|
KinopoiskSuggestAllMaxFTitleAttenV1Bm15K001
kp_text_machine: 4
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over all expansions. Aggregation type: the largest value of the factor. Computed over the document title. The hit weight is multiplied by 1/(1 + position of the word in the sentence). Word-weight aggregation algorithm: BM15. Normalization coefficient 0.01.
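A rough sketch of the position-attenuated BM15 scoring described above. It assumes BM15 has the common saturating form tf / (tf + k) averaged over query words, that word positions are zero-based, and that matching is by exact token; none of this is confirmed by the source, and the function name is hypothetical.

```python
def attenuated_bm15(query_words, title_sentences, k=0.01):
    """Average over query words of tf / (tf + k), with position-attenuated hits.

    Each hit contributes 1 / (1 + position of the word in its sentence) to tf.
    The saturating form and the zero-based positions are assumptions.
    """
    if not query_words:
        return 0.0
    score = 0.0
    for word in query_words:
        tf = 0.0
        for sentence in title_sentences:
            for position, token in enumerate(sentence):
                if token == word:
                    tf += 1.0 / (1 + position)
        score += tf / (tf + k)
    return score / len(query_words)


print(attenuated_bm15(["matrix"], [["the", "matrix"]]))
```

With k = 0.01 the score saturates almost immediately, so under these assumptions the factor mostly reflects whether each query word occurs in the title at all.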
|
KinopoiskSuggestAllMaxFTitleWordCoverageExact
kp_text_machine: 5
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over all expansions. Aggregation type: the largest value of the factor. Computed over the document title. Coverage of the query words in exact form.
|
KinopoiskSuggestTopMinWFTitleWordCoverageForm
kp_text_machine: 6
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over the top 10 expansions (by factor value). Aggregation type: the smallest weighted value of the factor. Computed over the document title. Coverage of the query words up to word form (without synonyms).
|
KinopoiskSuggestAllMaxWFSumWTitleExactQueryMatchAvgValue
kp_text_machine: 7
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over all expansions. Aggregation type: the largest weighted value of the factor, normalized by the total weight of the expansions. Computed over the document title. The average weight of the annotations among those in which the query was an exact match.
|
KinopoiskSuggestAllSumW2FSumWTitleExactQueryMatchAvgValue
kp_text_machine: 8
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over all expansions. Aggregation type: the sum of squared expansion weights multiplied by the factor value, normalized by the total weight of the expansions. Computed over the document title. The average weight of the annotations among those in which the query was an exact match.
|
KinopoiskSuggestAllMaxWFMaxWTitleCosineMatchMaxPrediction
kp_text_machine: 9
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over all expansions. Aggregation type: the largest weighted value of the factor, normalized by the maximum expansion weight. Computed over the document title. Algorithm: CosineMatchMaxPrediction.
|
KinopoiskSuggestTopMinWFSumWTitleExactQueryMatchAvgValue
kp_text_machine: 10
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over the top 10 expansions (by factor value). Aggregation type: the smallest weighted value of the factor, normalized by the total weight of the expansions. Computed over the document title. The average weight of the annotations among those in which the query was an exact match.
|
KinopoiskSuggestAllTotalW
kp_text_machine: 11
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over all expansions. Remapped total weight of the expansions.
|
KinopoiskSuggestAllAvgW
kp_text_machine: 12
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over all expansions. The average weight of the expansions.
|
KinopoiskSuggestAllMinW
kp_text_machine: 13
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over all expansions. The minimum expansion weight.
|
KinopoiskSuggestAllNumX
kp_text_machine: 14
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over all expansions. Number of expansions, remapped by the formula x / (x + 10).
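The x / (x + 10) remapping named above squeezes an unbounded count into [0, 1). A minimal sketch (the function name and the `k` parameter are hypothetical):

```python
def remap_count(x, k=10.0):
    """Map a non-negative count into [0, 1) via x / (x + k)."""
    return x / (x + k)


for count in (0, 10, 90):
    print(count, remap_count(count))
```

The value reaches 0.5 when x equals k, so k sets the count at which the factor is half saturated.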
|
KinopoiskSuggestTopNumX
kp_text_machine: 15
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over the top 10 expansions (by factor value). Number of expansions, remapped by the formula x / (x + 10).
|
IsPorno
neural_network_over_dssm_factors: 0
Document from porn kitski
|
IsFake
neural_network_over_dssm_factors: 2
Fake document
|
IsEShop
neural_network_over_dssm_factors: 3
Commercial page (Classifier Savina)
|
IsForum
neural_network_over_dssm_factors: 4
The URL matches the forum_detector regular expression
|
IsObsolete
neural_network_over_dssm_factors: 5
The URL contains an old date; old news is detected this way. The factor is 1 if the URL contains a year <= 2007.
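One possible reading of the IsObsolete rule, as a sketch: look for a standalone four-digit year in the URL and compare it with 2007. The regular expression and the function name are assumptions; the source does not say how the year is extracted.

```python
import re

# A standalone 4-digit year (19xx or 20xx) not embedded in a longer digit run.
YEAR_RE = re.compile(r"(?<!\d)(19\d{2}|20\d{2})(?!\d)")


def is_obsolete(url, threshold_year=2007):
    """Return 1 if the URL contains a year <= threshold_year, else 0."""
    years = [int(y) for y in YEAR_RE.findall(url)]
    return 1 if any(y <= threshold_year for y in years) else 0


print(is_obsolete("http://example.com/news/2006/03/item.html"))
print(is_obsolete("http://example.com/news/2015/03/item.html"))
```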
|
HasPayments
neural_network_over_dssm_factors: 6
The page mentions 'payment by SMS'.
|
ClickedWithAnotherSEClicks
neural_network_over_dssm_factors: 7
Clicks on URLs shown in the search results for queries after which users went on to search in other search engines
|
ShowsWithAnotherSEClicks
neural_network_over_dssm_factors: 8
Impressions of URLs in the search results for queries after which users went on to search in other search engines
|
EshopValue
neural_network_over_dssm_factors: 9
How shop-like the page is
|
PornoValue
neural_network_over_dssm_factors: 10
How pornographic the page is
|
Poetry
neural_network_over_dssm_factors: 12
The poetry of the document
|
PoetryQuad
neural_network_over_dssm_factors: 13
The maximum poetry of the quatrain
|
SynS1
neural_network_over_dssm_factors: 14
Shows how unnatural the text is from the standpoint of the Russian language: an estimate of how likely the document text was produced by a synonymizer or generated automatically. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
SynFLremap1
neural_network_over_dssm_factors: 15
Shows how unnatural the text is from the standpoint of the Russian language: an estimate of how likely the document text was produced by a synonymizer or generated automatically. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
SynFLremap2
neural_network_over_dssm_factors: 16
Shows how unnatural the text is from the standpoint of the Russian language: an estimate of how likely the document text was produced by a synonymizer or generated automatically. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
SynPercentBadWordPairs
neural_network_over_dssm_factors: 18
An indicator of how unnatural the text is from the standpoint of the Russian language. The number of bad word pairs in the text, mapped to the interval [0, 1] by the formula z/(z+10)
|
SynNumBadWordPairs
neural_network_over_dssm_factors: 19
The share of bad pairs among all pairs found in the table: z/(x+1), where z is the number of bad pairs in the text, and x is ((http://wiki.yandex-team.ru/evgenijgrechnikov/testsynonimizers of 2000-navigable)) pairs
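The two bad-pair formulas above, z/(z+10) and z/(x+1), can be written out directly. The function names are hypothetical, and how pairs are detected or looked up in the table is outside this sketch.

```python
def bad_pairs_remapped(z):
    """Number of bad word pairs z mapped into [0, 1) via z / (z + 10)."""
    return z / (z + 10)


def bad_pairs_share(z, x):
    """Share of bad pairs, z / (x + 1); the +1 avoids division by zero when x == 0."""
    return z / (x + 1)


print(bad_pairs_remapped(10), bad_pairs_share(5, 9))
```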
|
NumLatinLetters
neural_network_over_dssm_factors: 20
Number of Latin letters in the text (not counting markup), mapped to [0, 1] by the formula n/(n+100)
|
HasBigPicture
neural_network_over_dssm_factors: 21
The page has a big picture
|
RusWordsInText
neural_network_over_dssm_factors: 22
Number of words in the text (a word is whatever the lemmatizer extracted), mapped to [0, 1] by the formula x/(x+a)
|
RusWordsInTitle
neural_network_over_dssm_factors: 23
The number of words of the Russian language in the title
|
MeanWordLength
neural_network_over_dssm_factors: 24
The average length of the word
|
PercentWordsInLinks
neural_network_over_dssm_factors: 25
Percentage of words inside <a> .. </a> tags out of all words
|
PercentVisibleContent
neural_network_over_dssm_factors: 26
Percentage of words outside tags (outside the angle brackets <>) out of all words
|
PercentFreqWords
neural_network_over_dssm_factors: 27
Percentage of the words of the text that are among the 200 most frequent words of the language
|
PercentUsedFreqWords
neural_network_over_dssm_factors: 28
Number of the 500 most frequent words of the language that are used in the text, divided by 500
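The two frequent-word factors above differ in their denominator: one divides by the number of words in the text, the other by the dictionary size. A sketch, where the function names, the tiny dictionary, and the 0-100 percentage scale are all assumptions:

```python
def percent_freq_words(text_words, top_words):
    """Percentage (0-100) of text tokens that belong to the frequent-word list."""
    if not text_words:
        return 0.0
    hits = sum(1 for w in text_words if w in top_words)
    return 100.0 * hits / len(text_words)


def percent_used_freq_words(text_words, top_words, dict_size=500):
    """Number of distinct frequent words used in the text, divided by dict_size."""
    return len(set(text_words) & set(top_words)) / dict_size


words = ["the", "cat", "and", "the", "dog"]
top = {"the", "and", "of"}  # stand-in for the real 200/500-word lists
print(percent_freq_words(words, top))
print(percent_used_freq_words(words, top, dict_size=3))
```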
|
TrigramsProb
neural_network_over_dssm_factors: 29
Logarithm of the geometric mean of the trigram probabilities in the text (the probability of a trigram is the number of its occurrences in the text divided by the total number of trigrams), mapped to [0, 1] by the formula -x/(x+a)
|
TrigramsCondProb
neural_network_over_dssm_factors: 30
Logarithm of the geometric mean of the conditional trigram probabilities. The conditional probability of a trigram is its probability divided by the probability of the bigram formed by its first two words
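A sketch of the two trigram statistics above, using the identity that the logarithm of a geometric mean is the mean of the logarithms. Averaging over trigram occurrences (rather than distinct trigrams) is an assumption, the function names are hypothetical, and the final remapping to [0, 1] is left out because its constant is not given.

```python
import math
from collections import Counter


def ngrams(words, n):
    """All consecutive n-word tuples of the word list."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def trigrams_log_prob(words):
    """Mean log-probability of the text's trigrams (log of their geometric mean)."""
    tri = ngrams(words, 3)
    if not tri:
        return 0.0
    counts = Counter(tri)
    return sum(math.log(counts[t] / len(tri)) for t in tri) / len(tri)


def trigrams_log_cond_prob(words):
    """Mean log of P(trigram) / P(bigram formed by its first two words)."""
    tri, bi = ngrams(words, 3), ngrams(words, 2)
    if not tri:
        return 0.0
    tri_counts, bi_counts = Counter(tri), Counter(bi)
    logs = []
    for t in tri:
        p_tri = tri_counts[t] / len(tri)
        p_bi = bi_counts[t[:2]] / len(bi)
        logs.append(math.log(p_tri / p_bi))
    return sum(logs) / len(logs)
```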
|
NumeralsPortion
neural_network_over_dssm_factors: 31
Shares of various parts of speech in the text. The share of numerals (among all words whose part of speech could be determined)
|
ParticlesPortion
neural_network_over_dssm_factors: 32
The share of particles
|
AdjPronounsPortion
neural_network_over_dssm_factors: 33
The share of pronominal adjectives
|
AdvPronounsPortion
neural_network_over_dssm_factors: 34
The share of pronominal adverbs
|
VerbsPortion
neural_network_over_dssm_factors: 35
The share of verbs
|
FemAndMasNounsPortion
neural_network_over_dssm_factors: 36
The share of words that can be both masculine and feminine nouns, but not neuter, among all nouns (examples: 'hummingbird' is a noun of indeterminate gender that can be resolved either way; 'Alexander' is a case of homonymy).
|
HasLiRuCounter
neural_network_over_dssm_factors: 38
The presence of a LiveInternet counter
|
NumSlashes
neural_network_over_dssm_factors: 40
The number of slashes in the URL
|
WatchVideo
neural_network_over_dssm_factors: 41
The presence of an embedded video player on the page
|
DownloadVideo
neural_network_over_dssm_factors: 42
Video for downloading
|
GskUrlModel
neural_network_over_dssm_factors: 43
Computed from the text of the URL using the sequence classifier Quality/Seq/GSK
|
SegmentAuxAlphasInText
neural_network_over_dssm_factors: 44
Number of letters in the AUX segment
|
SegmentAuxSpacesInText
neural_network_over_dssm_factors: 45
The number of spaces in the AUX segment
|
SegmentContentCommasInText
neural_network_over_dssm_factors: 46
The number of commas in the Content segment
|
UrlNGramsModel
neural_network_over_dssm_factors: 47
UrlNGramsModel ranking factor in ERF
|
StaticTitleBM25Ex
neural_network_over_dssm_factors: 48
BM25 of the page title against the text of the page
|
TrashAdv
neural_network_over_dssm_factors: 49
The greasy of the page
|
CommRus
neural_network_over_dssm_factors: 50
The weight of the document according to a single-word dictionary of commercial vocabulary
|
Soft404
neural_network_over_dssm_factors: 55
The page is a '404' page (share of '404' tokens relative to the total number of tokens on the page)
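The token share in the Soft404 description can be illustrated as follows. Tokenization is assumed to have happened already, and the function name is hypothetical.

```python
def soft404_share(tokens):
    """Share of tokens equal to '404' among all tokens on the page."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t == "404") / len(tokens)


page = ["error", "404", "page", "not", "found", "404", "404", "404"]
print(soft404_share(page))
```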
|
News
neural_network_over_dssm_factors: 56
This is news (determined by the characteristic ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaformula/tekushichiekomponenty/klassificacionnye?v=tkd#h45859-3 Patterns in URL))).
|
Cat
neural_network_over_dssm_factors: 57
This is a catalog (determined by the characteristic ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayafformula/tekushhiekomponenty/klassificacionnye? .
|
IsIndexPage
neural_network_over_dssm_factors: 60
This is index.(html/php/aspx?/...), without CGI parameters. Computed over all duplicates.
|
IsIndexPageSoft
neural_network_over_dssm_factors: 61
This is index.(html/php/aspx?/...), possibly with CGI parameters. Computed over all duplicates.
|
IsOwner
neural_network_over_dssm_factors: 62
Whether the host is the owner; roughly, host == Owner(host).
|
MinPathLen
neural_network_over_dssm_factors: 63
The minimum length of PathAndQuery over all near-duplicates.
|
TotalDups
neural_network_over_dssm_factors: 65
|
RusLang
neural_network_over_dssm_factors: 66
The language of the document is Russian.
|
AddTime
neural_network_over_dssm_factors: 67
The time the page was added; a larger value means an older document. The root is taken of the time mapped to the interval [0, 1], so that 3+ years gives 1.
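One way to read the AddTime remapping, as a sketch: scale the document age linearly so that three years maps to 1, clamp, then take the root. The linear scaling, the square root, and the day-based input are all assumptions; the source only says a root is taken and that 3+ years gives 1.

```python
import math

THREE_YEARS_DAYS = 3 * 365  # assumed definition of "3 years"


def add_time(age_days):
    """Root of the age scaled to [0, 1]; 3 years or more yields 1.0."""
    scaled = min(max(age_days, 0) / THREE_YEARS_DAYS, 1.0)
    return math.sqrt(scaled)


for days in (0, 365, 3 * 365, 10000):
    print(days, add_time(days))
```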
|
IsMainPage
neural_network_over_dssm_factors: 68
If this is the main page of the owner (most often a second-level domain, for example xxxx.ru), the factor is 1. For free hostings, personal blogs, etc. (for example, LiveJournal, narod.ru, etc.), third-level domains (such as xxxxx.narod.ru) will also have a factor of 1.
|
Hops
neural_network_over_dssm_factors: 69
The number of hops to the URL from the main page (fewer hops, i.e. closer to the main page, means a lower value: 0 is the main page, 1 means unreachable from the main page, and 0 < value < 1 means reachable from the main page). The normal value for the host root is 0.0039.
|
Ukrainian
neural_network_over_dssm_factors: 70
Equals 1 if the site has a Ukrainian geo attribute (i.e. 1 means a Ukrainian site)
|
IsBlog
neural_network_over_dssm_factors: 71
Page from a blog hosting platform
|
IsLivejournal
neural_network_over_dssm_factors: 72
Page from Livejournal.com
|
IsUnreachable
neural_network_over_dssm_factors: 73
The page is unreachable by links from the main page.
|
NumNonLettersInUrl
neural_network_over_dssm_factors: 74
The number of non-letter characters in the URL
|
IsHub
neural_network_over_dssm_factors: 76
Hub page
|
AuraDocLogShared
neural_network_over_dssm_factors: 77
Logarithm of the number of shingles on which this document is not unique
|
AuraDocLogAuthor
neural_network_over_dssm_factors: 78
Logarithm of the number of shingles on which this owner of the document is recognized as the author
|
AuraDocMeanSharedWeight
neural_network_over_dssm_factors: 80
The average weight of the non-unique shingles of this document
|
HasUserReviews
neural_network_over_dssm_factors: 82
The document contains user review/comment
|
HasDownloadLinkOnFile
neural_network_over_dssm_factors: 83
The document has a direct link to the file
|
HasDownloadLinkOnFileHosting
neural_network_over_dssm_factors: 84
The document has a link to filehosting
|
SegmentWordPortionFromMainContent
neural_network_over_dssm_factors: 86
The share of the document's words that come from segments with score > 2.
|
WikiLinkCount
neural_network_over_dssm_factors: 87
|
NastyContent
neural_network_over_dssm_factors: 88
Content ugliness factor.
|
TextFeatures
neural_network_over_dssm_factors: 119
The quality of the text. Computed by a rather complex formula
|
TextLike
neural_network_over_dssm_factors: 120
Text quality (classifier Alekseev)
|
DocLen
neural_network_over_dssm_factors: 121
Document length in sentences
|
UrlLen
neural_network_over_dssm_factors: 122
The length of the URL, divided by 5
|
IsHTML
neural_network_over_dssm_factors: 123
Document type - HTML
|
EngLang
neural_network_over_dssm_factors: 136
Document language - English
|
LanguagePopularity
neural_network_over_dssm_factors: 138
The popularity of the language of the document. A number from 0 to 1. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaformula/tekushhiekomponenty/languaguaguagepopalarity))
|
UrlHasNoDigits
neural_network_over_dssm_factors: 139
The URL contains no digits
|
DaterStatsYearNormLikelihood
neural_network_over_dssm_factors: 140
The likelihood function of the distribution of years in the document. Temporarily disabled
|
DaterStatsAverageSourceSegment
neural_network_over_dssm_factors: 141
The arithmetic mean position of dates in the document. Temporarily disabled
|
IsWiki
neural_network_over_dssm_factors: 142
page from ru.wikipedia.org
|
AdvAspam
neural_network_over_dssm_factors: 143
|
HostReliability
neural_network_over_dssm_factors: 163
The share of URLs that respond without errors
|
AddTimeMP
neural_network_over_dssm_factors: 181
The time the main page of the owner (host?) was added. Remapped in the same way as AddTime.
|
SeoInPayLinks
neural_network_over_dssm_factors: 186
The number of paid SEO links between hosts
|
RankComGoodness
neural_network_over_dssm_factors: 193
Classifier for estimates of commercial sites
|
RankComGoodnessBar
neural_network_over_dssm_factors: 194
Classifier that approximates the quality of commercial sites based on user behavior data
|
RankBoostGoodness
neural_network_over_dssm_factors: 195
The rank of site quality used for boosts of the Moscow commercial formula
|
RandomLogHostHasPaymentsAvg
neural_network_over_dssm_factors: 212
AVG aggregation of HasPayments web factor using random log
|
RandomLogHostIsVideoQueryAvg
neural_network_over_dssm_factors: 213
AVG aggregation of VideoQuery web factor using random log
|
RandomLogHostSyntQualityAvg
neural_network_over_dssm_factors: 214
AVG aggregation of SyntQuality web factor using random log
|
RandomLogHostGeoRegionalityVNewPerc90
neural_network_over_dssm_factors: 215
PERCENTILE_90 aggregation of GeoRegionalityVNew web factor using random log
|
RandomLogHostQClassDownloadAvg
neural_network_over_dssm_factors: 216
AVG aggregation of QClassDownload web factor using random log
|
RandomLogHostIsMusicAvg
neural_network_over_dssm_factors: 217
AVG aggregation of IsMusic web factor using random log
|
RandomLogHostQueryThEncyclopedicPerc25
neural_network_over_dssm_factors: 218
PERCENTILE_25 aggregation of QueryThEncyclopedic web factor using random log
|
RandomLogHostCommercialOwnerRankRegAvg
neural_network_over_dssm_factors: 219
AVG aggregation of CommercialOwnerRank_Reg web factor using random log
|
RandomLogHostYabarWordDNGIPerc25
neural_network_over_dssm_factors: 220
PERCENTILE_25 aggregation of YabarWordDepthNodesGradientMin web factor using random log
|
RandomLogHostPopularSEFRCBrowserAvg
neural_network_over_dssm_factors: 221
AVG aggregation of PopularSEFRCBrowser web factor using random log
|
RandomLogHostURLClicksMaxGeoRegionFRCRatioAvg
neural_network_over_dssm_factors: 222
AVG aggregation of URLClicksMaxGeoRegionFRCRatio web factor using random log
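The RandomLogHost* factors above aggregate a per-query or per-document web factor up to the host level over a random sample of the log. A sketch of that idea, where the row format `(host, value)`, the function names, and the nearest-rank percentile definition are assumptions not taken from the source:

```python
from collections import defaultdict


def avg(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def percentile(p):
    """Build a nearest-rank percentile aggregator for an integer p in 1..100."""
    def nearest_rank(values):
        ordered = sorted(values)
        rank = max(1, -(-p * len(ordered) // 100))  # ceil(p * n / 100)
        return ordered[rank - 1]
    return nearest_rank


def aggregate_by_host(rows, aggregate):
    """Group (host, factor_value) rows by host and apply the aggregator to each group."""
    by_host = defaultdict(list)
    for host, value in rows:
        by_host[host].append(value)
    return {host: aggregate(values) for host, values in by_host.items()}


rows = [("a.example", 0.0), ("a.example", 1.0), ("b.example", 0.5)]
print(aggregate_by_host(rows, avg))
print(aggregate_by_host(rows, percentile(90)))
```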
|