Tag: TG_DOC_TEXT
(1203 ranking factors)
Factors |
---|
TR
web_production: 1
Text relevance (MaxFreq, the frequency of the most frequent word, plays the role of the document length).
|
PrBonus
web_production: 3
Weight: 0.07124278745128 Priority bonus; priority 7 is the text priority. A binary factor: it takes the value 0 for all single-word queries and 1 for almost all queries of two or more words, except for a very small number of results for which no link passed the quorum and the text did not pass the quorum either.
|
TRp1
web_production: 4
Strict priority for TR, a text priority: all the words of the query occur somewhere in the document (and they satisfy the contextual restrictions of the query, for example, both words must be in one sentence).
|
TRp2
web_production: 5
Weight: -0.109820338929289 Phrase priority for TR, a text priority: all the words of the query occur consecutively in the document.
|
TRtitle
web_production: 8
The presence of the exact phrase (the query text) in the title (more precisely, in the first sentence of the document). Contextual restrictions and stop words are handled exactly as in TRp2, i.e. factor [8] is bounded above by factor [5].
|
TRhr
web_production: 9
There is a hit that passed the quorum in which all word positions are marked as having the relevance Best_relev (title or meta keywords).
|
Long
web_production: 15
Weight: -0.084798680877042 Long document (the longer the document, the greater the value of the factor).
|
TRhitw
web_production: 16
Hitweight is a variant of text relevance in which the weights of all hits are considered equal (i.e., bonuses for the title and for word proximity are not taken into account). The corresponding hits must still satisfy the restrictions of the syntactic sorcerer, i.e. we can assume that the TRhitw factor is 0 if and only if SoftAndOk is 0.
|
PureText
web_production: 18
Long text without links.
|
SubqueryThMatch
web_production: 23
Match between the thematic spectra of the query and the document. The query themes are produced by the ((http://wiki.yandex-team.ru/evgenijjkroxalev/subquery rules of the sorcerer Subquerysearch)). The document themes are taken from Yandex-Catalog.
|
TRref
web_production: 25
A factor based on the number of refines. The query language has a user-refine feature (a word preceded by a percent sign). The idea is that it means something like 'it would be good if this word were in the document'. The only valuable use of this feature known to ((http://staff.yandex-team.ru/gulin Andrey Gulin)) is a query of the form [%official %site name of the film]. Users are unaware of this feature because it is not described in any documentation. It is planned to remove it from the query language, but words with the User_refine priority will remain in the sorcerer. The factor indicates the maximum number of user_refine words found simultaneously within a single hit in the quorum. The count ranges from 0 to 3 (values above 3 are treated as 3) and is mapped to the half-open interval [0, 1).
|
TRboost
web_production: 26
The number by which some link factors (namely, factors 6, 7, 47 and 66) are multiplied if the text relevance is 0 and there are few links.
|
TRLRlemma
web_production: 27
Text relevance with matching by lemma.
|
RelevSentsDssm
web_production: 29
A DSSM model trained on reformulations; from the document it uses the sentences relevant to the query.
|
TRUnmapped
web_production: 39
TR divided by the cube of the number of words in the query and transformed by the standard REMAPTR.
|
RusLang
web_production: 40
The language of the document is Russian.
|
TextBM25
web_production: 46
Simple BM25 in text.
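The exact BM25 variant behind TextBM25 and the related factors is not spelled out in this table. Purely as a point of reference, a minimal sketch of the textbook Okapi BM25 score could look as follows; the parameters `k1` and `b`, the `idf` table, `avg_doc_len` and the function name `bm25_score` are all assumptions made for illustration, not details of the production factor.

```python
from collections import Counter


def bm25_score(query_terms, doc_terms, idf, avg_doc_len, k1=1.2, b=0.75):
    """Textbook Okapi BM25 (assumed variant, not the production formula).

    query_terms: list of query words
    doc_terms:   list of document words
    idf:         dict word -> IDF weight (hypothetical, supplied by the caller)
    """
    tf = Counter(doc_terms)
    # Length normalisation: longer-than-average documents are discounted.
    norm = k1 * (1.0 - b + b * len(doc_terms) / avg_doc_len)
    score = 0.0
    for term in query_terms:
        freq = tf.get(term, 0)
        if freq == 0:
            continue
        score += idf.get(term, 0.0) * freq * (k1 + 1.0) / (freq + norm)
    return score


if __name__ == "__main__":
    print(bm25_score(["a"], ["a", "b", "a"], {"a": 1.0}, avg_doc_len=3.0))
```

The saturation in `freq / (freq + norm)` is what keeps repeated occurrences of one word from dominating the score.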
|
TLBM25
web_production: 48
Weight: 0.031399776481102 Simple BM25 in text and links at the same time.
|
TLp1
web_production: 49
All the words of the request are in the text + links.
|
TxtPair
web_production: 53
Weight: -0.020921642736537 Simple BM25 over word pairs: we take all pairs of query words and count their occurrences in the document text. The weight of a pair is the sum of the weights of its words. It does not work if the query contains a stop word.
|
TxtBreak
web_production: 55
BM25 over the number of sentences in the document in which the word occurs.
|
TxtHead
web_production: 56
Weight: -0.037878046829073 BM25 computed over the headings only.
|
TxtHiRel
web_production: 57
BM25 computed only over high-relevance text ('significant' text, marked up with emphasis (<b>, etc.)).
|
HasNoTR
web_production: 61
The document has no TR.
|
TxtPairEx
web_production: 67
Weight: -0.00667940021707 the presence of pairs of words in the exact form
|
TxtBreakEx
web_production: 68
Weight: 0.024006117828321 the number of sentences in which there are many words in the exact form
|
TxtHeadEx
web_production: 69
Weight: -0.03957553241619 the presence of words in the header in the exact form
|
TxtHiRelEx
web_production: 70
BM25 in the exact form
|
TxtBm25Ex
web_production: 71
Simple BM25 in the exact form.
|
TxtPairSy
web_production: 72
Weight: -0.022152880819573 The presence of pairs of words, taking synonyms into account (>= TxtPair)
|
TxtBreakSy
web_production: 73
Weight: -0.116819481337211 the number of sentences in which there are many words taking into account synonyms
|
TxtHeadSy
web_production: 74
Weight: -0.012919083353605 the presence of words in the header, taking into account synonyms
|
TxtHiRelSy
web_production: 75
Weight: -0.039215257302626 BM25 taking into account synonyms
|
TxtBm25Sy
web_production: 76
Simple BM25 taking into account synonyms.
|
Megafon
web_production: 80
The relative frequency of the query words in links (1: the query words are often found in links, 0.3: rarely). More precisely, the value of this factor is penalized when: TR = 0 && LR = 0 && (no single link contains all the words of the query) && (the quorum was not passed) && (at least one pair of query words is found in the text)
|
BFexact
web_production: 91
The exact form of every query word is present in the text/links
|
BFlemma
web_production: 92
The lemma of every query word is present in the text/links
|
SoftAndOk
web_production: 93
The document passed SoftAnd under the restrictions of the syntactic sorcerer. Only for documents with text relevance. For single-word queries, always 1.
|
TextFeatures
web_production: 100
Weight: -0.016033504310566 The quality of the text, computed by a rather complex formula
|
TextLike
web_production: 101
Weight: -0.094096848692163 Text quality (classifier Alekseeva)
|
DocLen
web_production: 110
Weight: -0.065128132003719 Document length in sentences
|
IsHTML
web_production: 114
Document type - HTML
|
IsPorno
web_production: 131
Document from porn kitski
|
IsComm
web_production: 132
Weight: -0.066463228806236 A document from a commercial clay. Not used (deprecated)
|
IsFake
web_production: 133
Fast document
|
IsSEO
web_production: 134
The page title contains commercial vocabulary. Not used (deprecated)
|
IsEShop
web_production: 136
Commercial page (Classifier Savina)
|
HasNoAllWordsTRSy
web_production: 138
The document does not contain all the words of the query (up to synonyms)
|
NumWordsTRSy
web_production: 139
The percentage of the query words found in the document (up to synonyms)
|
HasAllWordsTRSy
web_production: 140
The document contains all the words of the query (up to synonyms)
|
TxtInvPair
web_production: 144
TR over pairs of words in reverse order
|
TxtSkipPair
web_production: 146
Weight: -0.077504878926916 TR over pairs of query words separated by one word in the text
|
NumWordsTRFm
web_production: 148
The percentage of the query words found in the text (up to word form)
|
HasAllWordsTRFm
web_production: 149
The document contains all the words of the query (up to word form)
|
TLen
web_production: 164
The length of the page text in words: TLen = map(number of words, 1/400), where map(x, y) = x*y / (1 + x*y)
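The remapping above can be written out directly. The sketch below only restates the formula from the description; the function names `remap` and `tlen` are chosen here for illustration.

```python
def remap(x: float, y: float) -> float:
    """map(x, y) = x*y / (1 + x*y): squashes a non-negative value into [0, 1)."""
    xy = x * y
    return xy / (1.0 + xy)


def tlen(num_words: int) -> float:
    """TLen as described: map(number of words, 1/400)."""
    return remap(num_words, 1.0 / 400.0)


if __name__ == "__main__":
    for n in (0, 100, 400, 4000):
        print(n, tlen(n))
```

With the scale 1/400, a 400-word page lands at the midpoint of the range, and very long pages approach but never reach 1.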
|
ExactWordOrderLen
web_production: 180
The length of the longest match of exact forms between the text and the query
|
ExactWordOrderWeight
web_production: 181
The weight of the longest match of exact forms between the text and the query
|
WordOrderLen
web_production: 182
The length of the longest match by lemma between the text and the query
|
WordOrderWeight
web_production: 183
The weight of the longest match by lemma between the text and the query
|
TRp1All
web_production: 185
Variants of the relevance factors that take stop words into account
|
LRp1All
web_production: 186
Variants of the relevance factors that take stop words into account
|
TLp1All
web_production: 187
Weight: 0.055767877134775 Variants of the relevance factors that take stop words into account
|
BFexactAll
web_production: 188
Variants of the relevance factors that take stop words into account
|
BFlemmaAll
web_production: 189
Weight: 0.059222635368125 Variants of the relevance factors that take stop words into account
|
PassageLegacyTR
web_production: 190
Weight: 0.038806477920761 TR of the best passage: how good a snippet it would make
|
TxtBM25AttenSyn
web_production: 191
Weight: 0.075434934641649 Tr with discount for suggestions
|
TRWithStops
web_production: 199
The weight of the longest match of exact forms between the text and the query
|
LRWithStops
web_production: 200
The weight of the longest match of exact forms between the text and the query
|
HasPayments
web_production: 201
The page has text about 'paid SMS'.
|
EshopValue
web_production: 203
Weight: -0.123814718900663 Shop-likeness of the page
|
PornoValue
web_production: 204
Pornography of the page
|
AuxTextBM25
web_production: 268
BM25 for the user region for localized queries, for the unflapped in Cuba, is a country. The texts of the queries sent for the regions can be viewed in Relev_regions.txt in the sorcerer
|
TRDocQuorum
web_production: 283
The total weight of the query words that are present in the text
|
TRLRDocQuorum
web_production: 285
The total weight of the query words that are present in the text and links
|
JokerLen
web_production: 297
Text features are computed under the assumption that the page title is attached to each of its sentences, i.e. the distance between a word from the title and any other word is 1 sentence. Len is the maximum, over sentences (with the title attached), of the number of query words found in the sentence, relative to the length of the query. Example: [Harms Circus Vertunov] for ((http://wiki.yandex-team.ru//h.yandex.net/?http%3A%2F%2FWWWWIKILIVRES.info%2FWIKI%2F%25D0%25A6%25D %25b8%25D1%2580%25D0%25D0%25A %25BC%25D1%2581%of this document))
|
JokerWeight
web_production: 298
The ratio of the sum of IDFs of the words found in a sentence plus title to the sum of IDFs of all words.
|
ExactJokerLen
web_production: 299
The same as JokerLen, but for exact forms
|
ExactJokerWeight
web_production: 300
The same as JokerWeight, but for exact forms
|
Adultness
web_production: 312
equals 2 * NastyContent
|
Poetry
web_production: 319
The poetry of the document
|
PoetryQuad
web_production: 320
The maximum poetry of the quatrain
|
EngLang
web_production: 321
Document language - English
|
Has2ExactQueryParts
web_production: 322
The query is fully covered by two exact groups, each consisting of exact matches of consecutive query words ((http://wiki.yandex-team.ru/poiskovajaplatform/tr/coveragebygroups about coverage by groups))
|
HasLevensht1QueryFragment
web_production: 323
There is a group of exact matches of query words that covers the query (possibly with one word skipped, added or replaced)
|
LargestSyInexactGroup
web_production: 324
Weight: -0.067337343351376 The share of the query covered by the longest group consisting of any hits (including word forms and synonyms), possibly with one word skipped, added or replaced
|
CyrLang
web_production: 327
The language of the document uses the Cyrillic script
|
SynS1
web_production: 334
Shows how unnatural the text is from the point of view of the Russian language: an estimate of how likely the document text was produced by a synonymizer or generated automatically. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
SynFLremap1
web_production: 335
Weight: 0.002431406823392 Shows how unnatural the text is from the point of view of the Russian language: an estimate of how likely the document text was produced by a synonymizer or generated automatically. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
SynFLremap2
web_production: 336
Weight: 0.08033186404617 Shows how unnatural the text is from the point of view of the Russian language: an estimate of how likely the document text was produced by a synonymizer or generated automatically. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
PageDate
web_production: 345
Weight: -0.034716206980983 The document date as written on the page, remapped
|
HasTextPos
web_production: 350
The document has textual relevance
|
QSegmentsBM25
web_production: 351
Weight: -0.059299975637935 BM25, where the selected segments of the request act as 'words'
|
QSegmentsWeight
web_production: 352
Weight: -0.057628362537565 'Weight' of the segments of the request in the text
|
SynPercentBadWordPairs
web_production: 353
An indicator of how unnatural the text is from the point of view of the Russian language. The number of bad word pairs in the text, mapped to the segment [0, 1] by the formula Z/(Z+10)
|
SynNumBadWordPairs
web_production: 354
The proportion of bad pairs among all pairs found in the table: Z/(X+1), where Z is the number of bad pairs in the text and X is the number of ((http://wiki.yandex-team.ru/evgenijgrechnikov/testsynonimizers of 2000-navigable)) pairs
|
NumLatinLetters
web_production: 355
Weight: -0.086731079136512 The number of Latin letters in the text (not counting markup), mapped to [0, 1] by the formula n/(n+100)
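Several factors in this table squash a raw count into a bounded range with the same x/(x+a) shape (n/(n+100) here, Z/(Z+10) for bad word pairs). A minimal sketch of that mapping follows; the helper names `saturate` and `num_latin_letters` are hypothetical, and the input is assumed to be text with markup already stripped.

```python
def saturate(x: float, a: float) -> float:
    """x / (x + a): maps a non-negative count into [0, 1); a sets the midpoint."""
    return x / (x + a)


def num_latin_letters(text: str) -> float:
    """Count ASCII Latin letters in markup-free text and apply n / (n + 100)."""
    n = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    return saturate(n, 100.0)


if __name__ == "__main__":
    print(num_latin_letters("Hello мир"))  # 5 Latin letters -> 5 / 105
```

The constant `a` is the count at which the factor reaches 0.5, so 100 Latin letters or 10 bad pairs sit at the midpoint of their respective factors.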
|
DocIdfSumFixed
web_production: 357
Previous factors - fixed
|
TitleIdfSumFixed
web_production: 358
Weight: 0.047164043400143 Previous factors - fixed
|
HeadingIdfSumFixed
web_production: 359
Weight: -0.068235863277027 Previous factors - fixed
|
NormalTextIdfSumFixed
web_production: 360
Previous factors - fixed
|
RusWordsInText
web_production: 364
The number of words in the text (a word is whatever the lemmatizer extracted), mapped to [0, 1] by the formula x/(x+a)
|
RusWordsInTitle
web_production: 365
Weight: 0.03118624384934 The number of words of the Russian language in the title
|
MeanWordLength
web_production: 366
Weight: 0.019580616053835 The average length of the word
|
PercentWordsInLinks
web_production: 367
Weight: 0.057053549836014 The share of words inside <a>...</a> tags among all words
|
PercentVisibleContent
web_production: 368
Weight: -0.032828345615772 The share of words outside tags (outside the angle brackets <>) among all words
|
PercentFreqWords
web_production: 369
Weight: -0.020210221137273 The share of words that belong to the 200 most frequent words of the language, among all words of the text
|
PercentUsedFreqWords
web_production: 370
Weight: -0.063976585802142 The number of the 500 most frequent words of the language that are used in the text, divided by 500
|
TrigramsProb
web_production: 371
Weight: -0.002170850269151 Logarithm of the geometric mean of the trigram probabilities in the text (the probability of a trigram is the number of its occurrences in the text divided by the number of all trigrams), mapped to [0, 1] according to the formula -x (x+a)
|
TrigramsCondProb
web_production: 372
Weight: 0.026650508120317 Logarithm of the geometric mean of the conditional probabilities of trigrams. The conditional probability of a trigram is its probability divided by the probability of the bigram formed by its first two words
|
DaterAge
web_production: 380
Weight: -0.207437366708906 The difference between the current date and the document date determined by the dater: 1 means the document date equals the current date, 0 means the document is 10 years old or more; if the date is not determined, the value is 0. Note: ((1 - DaterAge)*60)^2 = age of the page in days.
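The relation between the factor value and the page age can be checked numerically. In the sketch below, the forward conversion is the formula from the description; the inverse (with clipping at 0) is derived here by solving that formula for the factor value and is an assumption, as are the function names.

```python
import math


def age_days_from_dater_age(dater_age: float) -> float:
    """Age in days from the factor value: ((1 - DaterAge) * 60) ^ 2."""
    return ((1.0 - dater_age) * 60.0) ** 2


def dater_age_from_age_days(age_days: float) -> float:
    """Assumed inverse: 1 - sqrt(age) / 60, clipped to 0 for very old pages."""
    return max(0.0, 1.0 - math.sqrt(age_days) / 60.0)


if __name__ == "__main__":
    for value in (1.0, 0.5, 0.0):
        print(value, age_days_from_dater_age(value))
```

A factor value of 0 corresponds to 60^2 = 3600 days, which is just under ten years and so agrees with the "10 years or more" bound in the description.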
|
TextMaxForms
web_production: 385
Weight: -0.015212586791057 The maximum number of forms over all words of the query: max over all query words of number_of_forms_of_word/64
|
TextWeightedForms
web_production: 386
Weight: 0.022803839020796 The sum of the number of forms weighted by word weights: the sum over all query words of number_of_forms_of_word/64*word_weight; remap of the form x/(1 + x).
|
TextForms
web_production: 387
Weight: -0.008656938143421 The unweighted sum of the number of forms: the sum over all query words of number_of_forms_of_word/64/number_of_words
|
TR_W1
web_production: 391
Analogue of the factor of the same name, with the weight of every word set to 1
|
TextBM25_Fm_W1
web_production: 393
Analogue of the factor of the same name, with the weight of every word set to 1
|
TextBM25_Sy_W1
web_production: 394
Analogue of the factor of the same name, with the weight of every word set to 1
|
TLBM25_W1
web_production: 396
Analogue of the factor of the same name, with the weight of every word set to 1
|
NumeralsPortion
web_production: 399
The share of different parts of speech in the text. The share of numerals (among all words whose part of speech could be recognized)
|
ParticlesPortion
web_production: 400
Weight: -0.012429221647235 The share of particles
|
AdjPronounsPortion
web_production: 401
Weight: -0.005976754416269 The share of pronominal adjectives
|
AdvPronounsPortion
web_production: 402
Weight: -0.001250755074786 The share of pronominal adverbs
|
VerbsPortion
web_production: 403
The share of verbs
|
FemAndMasNounsPortion
web_production: 404
Weight: 0.011650367441796 The share of words that can be both masculine and feminine nouns, but not neuter, among all nouns (examples: 'hummingbird' is a noun of indeterminate gender that can be resolved either way; 'Alexander' is a case of homonymy).
|
LongestText
web_production: 410
Weight: 0.069696682544392 The size of the largest text segment (from factor [18] PureText)
|
DssmYaMusicASREarlyBindingCe
web_production: 436
A DSSM model with early binding, trained on reformulations and fine-tuned on ASR hypotheses of music queries to Alice
|
DssmBertDistillSinsigCeCountryRegChain
web_production: 437
A model trained on PRS logs to predict a BERT that was trained on sinsig_ce with threshold value 0.5, using the chain of regions up to the country
|
DssmYaMusicEarlyBindingCe
web_production: 438
A DSSM model with early binding, trained on reformulations and fine-tuned on music queries to Alice
|
Swbm25
web_production: 452
Weight: 0.019740981979634 A sophisticated BM25 over a sliding window. The window size is set in sentences. 'Jokers' are used for headings and the beginning of the document. Morphological proximity and the structure of the text are taken into account. The weight of a window decays with its distance from the beginning of the document.
|
PositionLanguageModel
web_production: 453
Weight: -0.032269052994315 A factor estimating whether a good snippet can be produced.
|
TxtPair_W1
web_production: 454
Weight: -0.016932610010322 Simple BM25 over word pairs: we take all pairs of query words and count their occurrences in the document text. Weight = 1. It does not work if the query contains a stop word.
|
AuraDocLogShared
web_production: 455
Weight: -0.097686304848915 Logarithm of the number of shingles on which this document is not unique
|
AuraDocLogAuthor
web_production: 456
Weight: -0.097277529611975 Logarithm of the number of shingles on which this owner of the document is recognized as the author
|
AuraDocMeanSharedWeight
web_production: 457
Weight: -0.110593487056685 The average weight of the non-unique shingles of this document
|
LanguageCompliance
web_production: 469
Weight: 0.054576897612176 The language of the document matches the language of the query
|
IsPornoAdvert
web_production: 477
The page contains porn advertising
|
BM25FdPR_obsolete
web_production: 481
Weight: 0.054156294329288 BM25 with different parameters for different fields, including incoming anchor text. The weight of the text of links pointing to the page is normalized depending on the delta PageRank of the links
|
YmwFull
web_production: 492
Weight: -0.044940112806396 The size of the minimum piece of text, including all the words of the request found in the document. Not used now. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAformula/tekushhiekomponenty/ymw Read more))
|
Bclm
web_production: 493
Weight: 0.030786458206337 Buettcher, Clarke and Lushman factor (modified) ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAformula/tekushichiekomponenty/bclm more))
|
FieldLM
web_production: 495
Weight: 1.36522746e-7 Unigram language model. The language model is built from the document and smoothed by the general language model. When building the model, information about the field of the document in which the query word occurred (Title, Head or Plain Text) is used
|
TitleTrigramsQuery
web_production: 501
Weight: 0.112928770384249 Computes the coverage of the query by letter trigrams of the document title
|
TitleTrigramsTitle
web_production: 502
Computes the coverage of the document title by letter trigrams of the query
|
QueryWordSequencesTR
web_production: 504
Weight: -0.11860635115951 Computes a sum over sequences of more than two query words found within one sentence; normalized by the length of the document.
|
DmozThemeMatchAll
web_production: 511
Match between the thematic spectra (according to DMOZ) of the query and the document. The query theme is determined by ((http://wiki.yandex-team.ru/jandekspoisk/zarubezhnyjjinternet/dmozqueryClassifier1 the rule of the sorcerer Dmoztheme))
|
DmozThemeMatchBest
web_production: 512
Match between the thematic spectra (according to DMOZ) of the query and the document. The query theme is determined by the best result of ((http://wiki.yandex-team.ru/jandekspoisk/zarubezhnyjjinternet/dmozqueryClassifier1 the rules of the sorcerer DmozTheme)). The document theme is determined by an automatic classifier
|
Mpsa
web_production: 513
Weight: 0.093045433292429 Evaluates the minimum distance between pairs of query words, taking into account how far the pair is from the beginning of the document (Minimal Pair Size with Attenuation). Pairs here mean all consecutive bigrams of query words, so the number of pairs equals the number of words in the query minus 1. Accordingly, the factor is meaningful only for queries of more than one word. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaformula/ Tekushhiekomponenty/MPSA MPSA))
|
Bclm2
web_production: 514
It differs from Bclm in that the weights of all words are considered the same. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAformula/tekushhiekomponenty/bclm2 BCLM2))
|
AbsolutePLM
web_production: 515
Text relevance based on a language model that takes absolute position into account. We slide a window of 20 words along the text, build a language model on each window (that is, a probability distribution over the words of the Russian language) and compute the probability of generating the query. The model is penalized for distance from the beginning of the document.
|
BclmLite
web_production: 522
A modification of the Bclm2 factor, lightweight for use in tulle. The main difference is that BclmLite does not use absolute offsets of words relative to the beginning of the document. Instead, the factor works with ordinary positions of the form <sentence_number, position_in_sentence>. The proximity between words is taken into account only inside a sentence. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaFormula/tekushichiekomponenty/bclmlite bclmlite))
|
YmwFull2
web_production: 527
Weight: -0.044940112806396 Fixed YmwFull. It differs from the previous version only in its behavior on two-word queries. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAformula/tekushhiekomponenty/ymw Read more))
|
FullQuorum
web_production: 528
Binary factor, every word of the request is in the text or in the links
|
AuxCTextBM25
web_production: 529
'Country praets' (AUXQC)
|
AuxCLinkBM25
web_production: 530
'Country praets' (AUXQC)
|
Soft404
web_production: 531
Page - '404' (share of tokens '404' in relation to the total number of tokens on the page)
|
DBM25
web_production: 533
BM25 in which the weight of a word is machine-learned
|
QueryWordCohesionTR
web_production: 534
Weight: -0.053739168786067 The factor evaluates how closely the query words are grouped together in the text of the document, regardless of their order. ((http://wiki.yandex-team.ru/sergejjkrylov/queryWordCohesionTR Description))
|
SegmentAuxAlphasInText
web_production: 542
Weight: 0.010581678208134 Number of letters in the AUX segment
|
SegmentAuxSpacesInText
web_production: 543
Weight: -0.011681967583253 The number of spaces in the AUX segment
|
SegmentContentCommasInText
web_production: 544
The number of commas in the Content segment
|
IsShop
web_production: 545
Weight: -0.133931985443449 The page is a store. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAformula/tekushhiekomponenty/opisanijafaktorov#SSHOP Description)). Not used (deprecated)
|
AuraDocLogOrigin
web_production: 547
Logarithm of the number of shingles in the document added by the owner of the site as original texts in ((http://wiki.yandex-team.ru/jandekspoisk/jekosistema/marketingPr/webmasters/plan/vtorcontect of originality plugin)). It does not participate in the formula, it is needed to disconnect the takes
|
AuraDocMeanFltAuthorSource
web_production: 548
The average filtered number of sources of authorship of the document. It does not participate in the formula, it is needed to disconnect the takes
|
IdfVariance
web_production: 551
Weight: 0.025691573951246 Variance of the IDF of words
|
NationalLanguage
web_production: 553
The language of the document matches the country of the query
|
FiltrationSegments
web_production: 561
The share of the segments of the request present in the text
|
LanguageGoodForTurkey
web_production: 562
The language of the document is one of those acceptable for Turkey (Turkish, English, German, French, Arabic, Azerbaijani), or the document has zero length. At the search stage it is computed only for Isrealgeolocal queries.
|
DBM25_2
web_production: 563
Variation on the theme of ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaformula/tekushhiekomponenty/DBM25 dBM25)).
|
BM25FdPRFixed
web_production: 566
Weight: 0.058870258158539 BM25FdPR with normalization by the average document length, which depends on the language of the document. ((http://wiki.yandex-team.ru/bm25frework test results.))
|
LanguagePopularity
web_production: 567
The popularity of the language of the document. A number from 0 to 1. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaformula/tekushhiekomponenty/languaguaguagepopalarity))
|
QueryDOwnerWeightedSumFRCAndBM25FdPRFixed
web_production: 568
Weight: 0.087850313290757 The sum of the factors QueryDownerClicksFRC and BM25FdPRFixed with weights 0.358449 and 0.184922, respectively. The '565' in the name of the factor should not be taken literally; it is legacy or a typo.
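The combination described above is a plain weighted sum. A minimal sketch, using the two weights quoted in the description; the function and argument names are hypothetical, and any further remapping the production factor may apply is not covered.

```python
# Weights as quoted in the factor description.
W_QUERY_DOWNER_CLICKS_FRC = 0.358449
W_BM25_FDPR_FIXED = 0.184922


def weighted_sum(query_downer_clicks_frc: float, bm25_fdpr_fixed: float) -> float:
    """Linear combination of the two source factors (names are illustrative)."""
    return (W_QUERY_DOWNER_CLICKS_FRC * query_downer_clicks_frc
            + W_BM25_FDPR_FIXED * bm25_fdpr_fixed)


if __name__ == "__main__":
    print(weighted_sum(1.0, 1.0))
```

Since the weights add up to about 0.54 rather than 1, the result for inputs in [0, 1] stays below that bound.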
|
Tocm
web_production: 572
Weight: -0.005028751679547 The factor evaluates how the positions of words in the title differ from their order in the query
|
DBM30Smerch
web_production: 576
Variation on the theme of ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaformula/tekushhiekomponenty/DBM25 dBM25)).
|
DssmBertDistillL2
web_production: 579
A pool of logs is marked with BERT trained on Sinsig. DSSM model is trained on this pool using BaseregionChain
|
StaticTitleComm
web_production: 583
The degree to which the page title is commercial. Not used (deprecated)
|
StaticTitleBM25Ex
web_production: 584
Weight: 0.016179974819787 BM25 page title by its text
|
StaticTitleLRBM25
web_production: 585
Weight: 0.038263040612831 BM25 page title by texts of links to it
|
TitleInLinksTrigrams
web_production: 597
Weight: -0.076334972364641 The share of unique title trigrams among the trigrams of links
|
LinksInTitleTrigrams
web_production: 598
Weight: 0.019301158836494 The share of unique link trigrams among the trigrams of the title
|
TrashAdv
web_production: 599
The greasy of the page
|
DBM35
web_production: 606
Weight: 0.046757967567051 BM25 over texts and links with special weights depending on the level of the match (form, lemma, synonym)
|
TRLRQuorumFm
web_production: 607
Weight: -0.062810308974889 The total weight of the query words that are present in the text in the exact form
|
TRLRQuorumLemma
web_production: 608
Weight: -0.003021983245146 The total weight of the query words that are present in the text, up to lemma
|
TRLRQuorumSyn
web_production: 609
The total weight of the query words that are present in the text
|
SmallWindow
web_production: 621
The maximum total weight of the query words within a window of 50 words
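One way to read this description is as a sliding-window maximum. The sketch below is only an interpretation: it assumes each distinct query word contributes its weight once per window, that weights are given as a dict, and that windows are taken over consecutive word positions. The name `small_window` and all parameters are hypothetical.

```python
def small_window(doc_words, word_weights, window=50):
    """Maximum total weight of distinct query words inside any `window`-word span.

    doc_words:    list of document words in order
    word_weights: dict query word -> weight (assumed input)
    """
    best = 0.0
    last_start = max(0, len(doc_words) - window)
    for start in range(last_start + 1):
        # Query words that occur at least once in the current window.
        present = word_weights.keys() & set(doc_words[start:start + window])
        best = max(best, sum(word_weights[w] for w in present))
    return best


if __name__ == "__main__":
    doc = ["x"] * 60 + ["a"] + ["x"] * 10 + ["b"]
    print(small_window(doc, {"a": 0.6, "b": 0.4}))
```

This brute-force version recomputes each window from scratch, which is fine for illustration; an incremental counter per query word would avoid the repeated work on long documents.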
|
FooterInLinksTrigrams
web_production: 648
The share of unique trigrams of a footer fragment in trigrams of links
|
LinksInFooterTrigrams
web_production: 649
The share of unique trigrams of links among a fragment of trigrams of a footer
|
DBM40
web_production: 652
Variation on the theme of ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaformula/tekushhiekomponenty/DBM25 dBM25)).
|
BM25_0
web_production: 654
Variation on the topic BM25
|
BM25_1
web_production: 655
Variation on the topic BM25
|
BM25_0123
web_production: 656
Variation on the topic BM25
|
DBMNumbers
web_production: 662
DBM separately by numbers
|
DBMGeo
web_production: 663
DBM separately by geo-objects of request
|
DBMSubstantive
web_production: 664
DBM computed separately over nouns
|
Bocm
web_production: 668
Evaluates the correspondence between the positions of words in the sentences of the document and the positions of words in the query.
|
FioMatch
web_production: 670
The document contains a name from the request.
|
HasDownloadLinkOnFile
web_production: 682
The document has a direct link to the file
|
HasDownloadLinkOnFileHosting
web_production: 683
The document has a link to filehosting
|
BclmMax
web_production: 696
The proximity of the query words to the heaviest word.
|
HasUserReviews
web_production: 698
The document contains user review/comment
|
DBM15Wares
web_production: 703
|
DocCreateMonth
web_production: 705
The creation time of the document with month precision: 1.0 is the current month, 0 is 10 years ago or older. Temporarily disabled
|
DocUpdateMonth
web_production: 706
The update time of the document with month precision: 1.0 is the current month, 0 is 10 years ago or older. Temporarily disabled
|
DaterStatsYearNormLikelihood
web_production: 709
The likelihood function of the distribution of years in the document. Temporarily disabled
|
DaterStatsAverageSourceSegment
web_production: 712
The arithmetic mean position of dates in the document. Temporarily disabled
|
DBM15Wares2
web_production: 713
|
Cabm
web_production: 714
BM with attenuation in the text of catalog links.
|
SegmentWordPortionFromMainContent
web_production: 723
The share of the document's words that come from segments with score > 2.
|
SmallWindowAttenuation
web_production: 734
|
WeightedSumIsIndexPageBocm
web_production: 762
|
AuxTitleBM25
web_production: 770
TextBM25 computed over the title against the text of the name of the user region, similar to factor 268.
|
Bclmf
web_production: 771
BCLM for Annotation index, doc text and links.
|
CommercialDssmOddLike
web_production: 812
Finetuned reformulations DSSM to commercial clicked bargain odd-like target from visit log
|
FioFromOriginalRequestBodyChain0Wcm
web_production: 820
Factor for the full name (FIO) from the original request, computed over the document body. Algorithm: Chain0Wcm.
|
DssmNavigationL2
web_production: 859
Request-document navigation model.
|
SmallWindowAttenuationQ
web_production: 865
|
QueryDocTitleRangesMatchingScore
web_production: 866
Factor over the request text and the document title (Title): a score for how well numerical ranges in marker words match.
|
FioFromOriginalRequestBodyMinWindowSize
web_production: 873
Factor for the full name (FIO) from the original request, computed over the document body. The minimum window size that contains all the words of the request, normalized by the number of words in the request.
|
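The MinWindowSize description above (the smallest window containing all request words, normalized by the number of request words) can be illustrated with a short Python sketch. This is only a sketch under stated assumptions: the function names are hypothetical, tokens are compared as plain strings, and the normalization `number of distinct request words / window size` is an assumed reading of "normalized by the number of words in the request", not a formula confirmed by the source.

```python
from collections import Counter
from typing import Optional, Sequence


def min_window_size(doc_tokens: Sequence[str], query_words: Sequence[str]) -> Optional[int]:
    """Length of the shortest token window containing every query word, or None."""
    need = set(query_words)
    if not need:
        return None
    counts: Counter = Counter()
    covered = 0          # how many distinct query words the current window contains
    best = None
    left = 0
    for right, tok in enumerate(doc_tokens):
        if tok in need:
            counts[tok] += 1
            if counts[tok] == 1:
                covered += 1
        # Shrink the window from the left while it still covers all query words.
        while covered == len(need):
            size = right - left + 1
            if best is None or size < best:
                best = size
            ltok = doc_tokens[left]
            if ltok in need:
                counts[ltok] -= 1
                if counts[ltok] == 0:
                    covered -= 1
            left += 1
    return best


def normalized_min_window(doc_tokens: Sequence[str], query_words: Sequence[str]) -> float:
    """Assumed normalization: distinct query words divided by window size (0.0 if no window)."""
    size = min_window_size(doc_tokens, query_words)
    if size is None:
        return 0.0
    return len(set(query_words)) / size
```

Under this assumed normalization the value is 1.0 when the request words appear adjacent to each other and falls toward 0 as they spread apart.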
FioFromOriginalRequestTextCosineMatchMaxPrediction
web_production: 874
Factor for the full name (FIO) from the original request, computed over the document text. Algorithm: CosineMatchMaxPrediction.
|
AllFioFromOriginalRequestAllMaxFBodyChain0Wcm
web_production: 875
Factor for all full names (FIO) from the original request. Aggregation over all extensions; aggregation type: the maximum value of the factor. Computed over the document body. Algorithm: Chain0Wcm.
|
AllFioFromOriginalRequestAllMaxFBodyMinWindowSize
web_production: 876
Factor for all full names (FIO) from the original request. Aggregation over all extensions; aggregation type: the maximum value of the factor. Computed over the document body. The minimum window size that contains all the words of the request, normalized by the number of words in the request.
|
AllFioFromOriginalRequestAllMaxFTextCosineMatchMaxPrediction
web_production: 882
Factor for all full names (FIO) from the original request. Aggregation over all extensions; aggregation type: the maximum value of the factor. Computed over the document text. Algorithm: CosineMatchMaxPrediction.
|
AliceClickDssm
web_production: 900
DSSM click prediction based on data specific to Alice.
|
TelFullAttributeTextBocm15K001
web_production: 901
Factor for the Tel_Full telephone attribute from the original request, computed over the document text. Word-weight aggregation algorithm: Bocm15. Normalization coefficient 0.01.
|
AliceTimespentSuffixSum
web_production: 957
Predicted total time spent until the end of the session, given that this request-document pair is realized.
|
AliceTimespent
web_production: 958
Predicted contribution of this request-document pair to time spent.
|
AliceMaxPercentPlayed
web_production: 965
Predicted percentage of the track length that will be played, given that this request-document pair is realized.
|
XfDtShowAllMaxFFieldSet2Bm15FLogK0001
web_production: 1025
Linguistic boosting factor. Type of extensions: XfDtShow. Factor: BM15 over stream group 2. The maximum value of the factor over extensions.
|
XfDtShowAllMaxFFieldSet3BclmWeightedFLogW0K0001
web_production: 1026
Linguistic boosting factor. Type of extensions: XfDtShow. Factor: BclmWeightedFLogW0 over stream group 3. The maximum value of the factor over extensions.
|
XfDtShowAllMaxFFieldSetUTBm15FLogW0
web_production: 1027
Linguistic boosting factor. Type of extensions: XfDtShow. Factor: Bm15FLogW0 over the URL and Title. The maximum value of the factor over extensions.
|
XfDtShowAllMaxFTextCosineMatchMaxPrediction
web_production: 1028
Linguistic boosting factor. Type of extensions: XfDtShow. Factor: CosineMatchMaxPrediction over the text and Title. The maximum value of the factor over extensions.
|
XfDtShowAllSumW2FSumWFieldSet1Bm15FLogK0001
web_production: 1032
Linguistic boosting factor. Type of extensions: XfDtShow. Factor: Bm15FLog over stream group 1. The weighted average of the factor values multiplied by weight (\frac{\sum w_i * (w_i * f_i)}{\sum w_i}) over extensions.
|
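The aggregation types named in these descriptions (maximum over extensions, weighted average, and the weighted average of values multiplied by weight) can be written out in a few lines. The following Python sketch is illustrative only: the function names and the zero fallback for empty input or zero total weight are assumptions, and only the last formula, \frac{\sum w_i * (w_i * f_i)}{\sum w_i}, is quoted directly from the descriptions.

```python
from typing import Sequence


def max_f(values: Sequence[float]) -> float:
    """Maximum factor value over extensions (0.0 for no extensions; assumed fallback)."""
    return max(values) if values else 0.0


def sum_wf_over_sum_w(weights: Sequence[float], values: Sequence[float]) -> float:
    """Weighted average: sum(w_i * f_i) / sum(w_i)."""
    total = sum(weights)
    if total == 0:
        return 0.0  # assumed fallback, not from the source
    return sum(w * f for w, f in zip(weights, values)) / total


def sum_w2f_over_sum_w(weights: Sequence[float], values: Sequence[float]) -> float:
    """Weighted average of values multiplied by weight: sum(w_i * (w_i * f_i)) / sum(w_i)."""
    total = sum(weights)
    if total == 0:
        return 0.0  # assumed fallback, not from the source
    return sum(w * (w * f) for w, f in zip(weights, values)) / total
```

Here `weights` holds one weight w_i per extension and `values` holds the per-extension value f_i of the underlying factor. Compared with the plain weighted average, the squared-weight variant further suppresses low-weight extensions.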
XfDtShowAllSumW2FSumWFieldSetUTBm15FLogW0
web_production: 1033
Linguistic boosting factor. Type of extensions: XfDtShow. Factor: Bm15FLogW0 over the URL and Title. The weighted average of the factor values multiplied by weight (\frac{\sum w_i * (w_i * f_i)}{\sum w_i}) over extensions.
|
XfDtShowAllSumWFSumWBodyMinWindowSize
web_production: 1034
Linguistic boosting factor. Type of extensions: XfDtShow. Factor: MinWindowSize over the text. The weighted average of the factor values over extensions.
|
XfDtShowBagOfWordsFieldSetBagOfWordsOriginalRequestFractionExact
web_production: 1035
Linguistic boosting factor. Type of extensions: XfDtShow. Factor: OriginalRequestFractionExact over the stream group for bag-of-words factors (text, Title, annotation streams).
|
XfDtShowBagOfWordsTitleCosineMaxMatch
web_production: 1039
Linguistic boosting factor. Type of extensions: XfDtShow. Bag-of-words factor: CosineMaxMatch over Title.
|
XfDtShowTopMinWFFieldSet3BclmWeightedFLogW0K0001
web_production: 1040
Linguistic boosting factor. Type of extensions: XfDtShow. Factor: BclmWeightedFLogW0 over stream group 3. The minimum weighted value of the factor over the top extensions.
|
XfDtShowTopSumW2FSumWBodyChain0Wcm
web_production: 1043
Linguistic boosting factor. Type of extensions: XfDtShow. Factor: Chain0Wcm over the text. The weighted average of the factor values multiplied by weight (\frac{\sum w_i * (w_i * f_i)}{\sum w_i}) over the top extensions.
|
XfDtShowTopSumWFSumWFieldSet3BclmWeightedFLogW0K0001
web_production: 1046
Linguistic boosting factor. Type of extensions: XfDtShow. Factor: BclmWeightedFLogW0 over stream group 3. The weighted average of the factor values over the top extensions.
|
TitleBm15K01
web_production: 1092
Bm15K01 factor over hits from Title
|
TitleBocm15K001
web_production: 1093
Bocm15K001 factor over hits from Title
|
TextBm11Norm16384
web_production: 1094
Bm11Norm16384 factor over hits from Text
|
TextBocm11Norm256
web_production: 1095
Bocm11Norm256 factor over hits from Text
|
TextCosineMatchMaxPrediction
web_production: 1096
CosineMatchMaxPrediction factor over hits from Text
|
FieldSet1Bm15FLogK0001
web_production: 1097
Bm15FLogK0001 factor over hits from FieldSet1 stream
|
FieldSet2Bm15FLogK0001
web_production: 1098
Bm15FLogK0001 factor over hits from FieldSet2 stream
|
FieldSet3BclmWeightedFLogW0K0001
web_production: 1099
BclmWeightedFLogW0K0001 factor over hits from FieldSet3 stream
|
FieldSetUTBm15FLogW0K00001
web_production: 1100
Bm15FLogW0K00001 factor over hits from FieldSetUT stream
|
BodyChain0Wcm
web_production: 1101
Chain0Wcm factor over hits from Body
|
BodyPairMinProximity
web_production: 1102
PairMinProximity factor over hits from Body
|
BodyMinWindowSize
web_production: 1103
MinWindowSize factor over hits from Body
|
DssmLongMiddleShortVsHardClicks
web_production: 1219
DSSM model trained on clicks.
|
DssmLongVsMiddleShortNoClicks
web_production: 1220
DSSM model trained on clicks.
|
DssmMiddleVsShortLongHardNoClicks
web_production: 1221
DSSM model trained on clicks.
|
DssmShortVsMiddleLongHardNoClicks
web_production: 1222
DSSM model trained on clicks.
|
DssmNOVsShortMiddleLongHardClicks
web_production: 1223
DSSM model trained on clicks.
|
DssmLongVsShortMiddleHardClicks
web_production: 1224
DSSM model trained on clicks.
|
DssmMiddleLongVsShortHardClicks
web_production: 1225
DSSM model trained on clicks.
|
DssmShortMiddleLongVsHardNoClicks
web_production: 1226
DSSM model trained on clicks.
|
Medical2UrlQuality
web_production: 1227
Neural model of content quality for medical subjects
|
Medical2UrlQualityFresh
web_production: 1244
Neural model of content quality for medical topics (for fresh documents)
|
FinLawUrlQuality
web_production: 1247
Neural model of content quality for financial and legal topics
|
FinLawUrlQualityFresh
web_production: 1249
Neural model of content quality for financial and legal topics (for fresh documents)
|
RequestWithRegionNameTextBm11Norm16384
web_production: 1255
Linguistic boosting factor. Type of extensions: RequestWithRegionName. Factor: BM11 over the text and Title of the document.
|
RequestWithRegionNameTextCosineMatchMaxPrediction
web_production: 1256
Linguistic boosting factor. Type of extensions: RequestWithRegionName. Factor: CosineMatchMaxPrediction over the text and title of the document.
|
RequestWithRegionNameFieldSet1Bm15FLogK0001
web_production: 1263
Linguistic boosting factor. Type of extensions: RequestWithRegionName. Factor: BM15 over stream group 1.
|
RequestWithRegionNameFieldSet2Bm15FLogK0001
web_production: 1264
Linguistic boosting factor. Type of extensions: RequestWithRegionName. Factor: BM15 over stream group 2.
|
RequestWithRegionNameFieldSet3BclmWeightedFLogW0K0001
web_production: 1265
Linguistic boosting factor. Type of extensions: RequestWithRegionName. Factor: BclmWeightedFLogW0 over stream group 3.
|
RequestWithRegionNameBodyChain0Wcm
web_production: 1266
Linguistic boosting factor. Type of extensions: RequestWithRegionName. Factor: Chain0Wcm over the document text.
|
SosUrlQuality
web_production: 1268
Neural model of content quality for SOS topics
|
SosUrlQualityFresh
web_production: 1270
Neural model of content quality for SOS topics (for fresh documents)
|
AliceTimespentSum
web_production: 1273
Predicted session time, given that this request-document pair is realized.
|
DssmSinsigL2
web_production: 1278
Request-document model Sinsiga.
|
OriginalRequestTitleBclmMixPlainKE5
web_production: 1281
Factor for the original request, computed over the document title. Word-weight aggregation algorithm: BclmMixPlain, a linear mixture of annotation BCLM weights and weighted positionless word weights; the resulting weights are then aggregated through BM15. Normalization coefficient 10^(-5).
|
OriginalRequestTitleCMMatchTop5AvgMatchValue
web_production: 1282
Factor for the original request, computed over the document title. Algorithm: CMMatchTop5AvgMatchValue.
|
OriginalRequestTitleWordCoverageForm
web_production: 1283
Factor for the original request, computed over the document title. The degree of coverage of the request words, matched up to word form (without synonyms).
|
OriginalRequestTitleAttenV1Bm15K05
web_production: 1284
Factor for the original request, computed over the document title. The weight of a hit is multiplied by 1 / (1 + the position of the word in the sentence). Word-weight aggregation algorithm: BM15. Normalization coefficient 0.5.
|
OriginalRequestBodyBclmMixPlainKE5
web_production: 1285
Factor for the original request, computed over the document body. Word-weight aggregation algorithm: BclmMixPlain, a linear mixture of annotation BCLM weights and weighted positionless word weights; the resulting weights are then aggregated through BM15. Normalization coefficient 10^(-5).
|
OriginalRequestBodyCosineMatchMaxPrediction
web_production: 1286
Factor for the original request, computed over the document body. Algorithm: CosineMatchMaxPrediction.
|
OriginalRequestBodyAllWcmWeightedPrediction
web_production: 1287
Factor for the original request, computed over the document body. Algorithm: AllWcmWeightedPrediction.
|
OriginalRequestBodyBocm15K001
web_production: 1288
Factor for the original request, computed over the document body. Word-weight aggregation algorithm: Bocm15. Normalization coefficient 0.01.
|
OriginalRequestBodyQueryPartMatchSumValueAny
web_production: 1289
Factor for the original request, computed over the document body. Algorithm: QueryPartMatchSumValueAny.
|
OriginalRequestBodyWordCoverageForm
web_production: 1290
Factor for the original request, computed over the document body. The degree of coverage of the request words, matched up to word form (without synonyms).
|
OriginalRequestBodyWordCoverageExact
web_production: 1291
Factor for the original request, computed over the document body. The degree of coverage of the request words in their exact form.
|
OriginalRequestBodyBm15MaxAnnotationK001
web_production: 1292
Factor for the original request, computed over the document body. Word-weight aggregation algorithm: Bm15MaxAnnotation. Normalization coefficient 0.01.
|
DssmLogDwellTimeBigrams
web_production: 1338
DSSM model trained on clicks. Takes bigrams into account.
|
XfDtShowTopSumW2FSumWFieldSet5AvgPerTrigramMaxValueAny
web_production: 1352
Linguistic boosting factor. Type of extensions: XfDtShow. Factor: AvgPerTrigramMaxValueAny over stream group 5. The weighted average of the factor values over the top extensions.
|
DssmLogDwelltimeBigramsL2
web_production: 1354
DSSM model trained on clicks. Takes bigrams into account. Embeddings for documents are computed offline.
|
DssmBigramsQueryDerivativeMin
web_production: 1356
Minimum of the gradients of the bigram LogDwellTime model.
|
DssmBigramsQueryDerivativeMax
web_production: 1357
Maximum of the gradients of the bigram LogDwellTime model.
|
DssmBigramsQueryDerivativeMoment2Central
web_production: 1358
The second central moment (variance) of the gradients of the bigram LogDwellTime model.
|
DssmBigramsQueryDerivativeMoment3Central
web_production: 1359
The third central moment of the gradients of the bigram LogDwellTime model.
|
QfufTopSumWFSumWFieldSet3BclmWeightedFLogW0K0001
web_production: 1390
Linguistic boosting factor. Type of extensions: QFUF. Factor: BclmWeightedFLogW0_K0.001 over FieldSet3. The weighted average of the factor values over the top-10 extensions.
|
QueryToTextAllSumWFSumWBodyMinWindowSize
web_production: 1391
Linguistic boosting factor. Type of extensions: QueryToText. Factor: MinWindowSize over the document body. The weighted average of the factor values over extensions.
|
QueryToTextTopMinWFBodyMinWindowSize
web_production: 1394
Linguistic boosting factor. Type of extensions: QueryToText. Factor: MinWindowSize over the document body. The weighted average of the factor values over the top-10 extensions.
|
QfufAllMaxFFieldSetUTBm15FLogW0K00001
web_production: 1395
Linguistic boosting factor. Type of extensions: QFUF. Factor: Bm15FLogW0_K0.0001 over the URL and the title. The maximum value of the factor over extensions.
|
QfufAllSumWFSumWFieldSet3BclmWeightedFLogW0K0001
web_production: 1396
Linguistic boosting factor. Type of extensions: QFUF. Factor: BclmWeightedFLogW0_K0.001 over FieldSet3. The weighted average of the factor values over extensions.
|
QueryToTextAllSumFCountBodyPairMinProximity
web_production: 1398
Linguistic boosting factor. Type of extensions: QueryToText. Factor: PairMinProximity over the document body. The average of the factor values over extensions.
|
QueryToTextAllSumFCountTextBocm11Norm256
web_production: 1400
Linguistic boosting factor. Type of extensions: QueryToText. Factor: Bocm11_Norm256 over the document text. The average of the factor values over extensions.
|
QfufAllMaxFTextCosineMatchMaxPrediction
web_production: 1401
Linguistic boosting factor. Type of extensions: QFUF. Factor: CosineMatchMaxPrediction over the document text. The maximum value of the factor over extensions.
|
QfufTopSumW2FSumWFieldSet1Bm15FLogK0001
web_production: 1402
Linguistic boosting factor. Type of extensions: QFUF. Factor: Bm15FLog_K0.001 over FieldSet1. The weighted average of the factor values with quadratic weight over the top-10 extensions by factor value.
|
QfufAllMaxFTextBocm11Norm256
web_production: 1403
Linguistic boosting factor. Type of extensions: QFUF. Factor: Bocm11_Norm256 over the document text. The maximum value of the factor over extensions.
|
QfufTopSumWFSumWFieldSetUTBm15FLogW0K00001
web_production: 1404
Linguistic boosting factor. Type of extensions: QFUF. Factor: Bm15FLogW0_K0.0001 over the URL and the title. The weighted average of the factor values over extensions.
|
DssmOneClickProbability
web_production: 1405
DSSM model trained on clicks, target=OneClicks/Clicks. Takes bigrams into account.
|
DssmQueryDwellTime
web_production: 1406
DSSM model trained on clicks, target=QueryDwellTime stream value. Takes bigrams into account.
|
AllMatchedWordWeightsSum
web_production: 1407
The normalized sum of the weights of the request words that occur in the text of the document or in links to it.
|
StringMatchedWordWeightsSum
web_production: 1408
The normalized sum of the weights of the request words that are Equal_by_String in the text of the document or in links to it.
|
AllMatchedWordWeightsSumText
web_production: 1409
The normalized sum of the weights of the request words that occur in the text of the document.
|
AllMatchedWordWeightsSumLink
web_production: 1410
The normalized sum of the weights of the request words that occur in links to the document.
|
StringMatchedWordWeightsSumLink
web_production: 1411
The normalized sum of the weights of the request words that are Equal_by_String in links to the document.
|
AllMatchedWordFiltrationModelWeightsSum
web_production: 1412
The normalized sum of the FiltrationModel weights of the request words that occur in the text of the document or in links to it.
|
StringMatchedWordFiltrationModelWeightsSum
web_production: 1413
The normalized sum of the FiltrationModel weights of the request words that are Equal_by_String in the text of the document or in links to it.
|
LemmaMatchedWordFiltrationModelWeightsSum
web_production: 1414
The normalized sum of the FiltrationModel weights of the request words that are Equal_by_Lemma in the text of the document or in links to it.
|
AllMatchedWordFiltrationModelWeightsSumLink
web_production: 1415
The normalized sum of the FiltrationModel weights of the request words that occur in links to the document.
|
StringMatchedWordFiltrationModelWeightsSumLink
web_production: 1416
The normalized sum of the FiltrationModel weights of the request words that are Equal_by_String in links to the document.
|
DssmLanguageClassifierRusL2
web_production: 1425
Document DSSM model Language Classifier Rus.
|
DssmLanguageClassifierEngL2
web_production: 1426
Document DSSM model Language Classifier Eng.
|
DssmLanguageClassifierOthL2
web_production: 1427
Document DSSM model Language Classifier Other.
|
alice_aramusic_dssm
web_production: 1430
|
AliceMusicRelevanceDssm
web_production: 1431
DSSM prediction used to detect irrelevant Alice answers.
|
BM25FdPRFixedNoLinks
web_production: 1462
BM25FdPR normalized by the average document length, which depends on the language of the document. Only texts are used.
|
NoApproxSmallWindowAttenuation
web_production: 1470
|
NoApproxSmallWindowAttenuationQ
web_production: 1471
|
DssmMainContentKeywords
web_production: 1472
Query-MainContentKeywords similarity, target: logDwellTime
|
DssmCtrNoMiner
web_production: 1504
DSSM model trained on CTRs without miner.
|
DssmQueryUrlTitleRegChainClicksOdd
web_production: 1513
DSSM model trained on click odd pool
|
DssmQueryUrlTitleRegChainClicksPers
web_production: 1514
DSSM model trained on click personalization pool
|
DssmQueryUrlTitleRegChainClicksTrFull
web_production: 1515
DSSM model trained on click triangle pool
|
DssmLogDtBigramsAMHardQueriesNoClicks
web_production: 1523
DSSM model trained on clicks without miner (with no-clicks and AM-hard negatives). Takes bigrams into account.
|
XfDtShowKnnAllMaxWFFieldSet3BclmWeightedFLogW0K0001
web_production: 1573
Linguistic boosting factor. Type of extensions: XfDtShowKnn. Factor: BclmWeightedFLogW0 over stream group 3. The maximum weighted value of the factor.
|
XfDtShowKnnAllMaxWFFieldSet2Bm15FLogK0001
web_production: 1574
Linguistic boosting factor. Type of extensions: XfDtShowKnn. Factor: Bm15FLog over stream group 2. The maximum weighted value of the factor.
|
XfDtShowKnnBagOfWordsFieldSetBagOfWordsOriginalRequestFraction
web_production: 1575
Linguistic boosting factor. Type of extensions: XfDtShowKnn. Factor: OriginalRequestFraction over the FieldSetBagOfWords stream.
|
XfDtShowKnnAllMaxWFSumWQueryDwellTimeMixMatchWeightedValue
web_production: 1576
Linguistic boosting factor. Type of extensions: XfDtShowKnn. Factor: MixMatchWeightedValue over the QueryDwellTime stream. The maximum weighted value of the factor, normalized by the total weight.
|
XfDtShowKnnAllSumW2FSumWTitleBm15K01
web_production: 1577
Linguistic boosting factor. Type of extensions: XfDtShowKnn. Factor: BM15 over the Title stream. The weighted sum of the factor values multiplied by weight (\frac{\sum w_i * (w_i * f_i)}{\sum w_i}), normalized by the total weight.
|
XfDtShowKnnTopMinFFieldSet3BclmWeightedFLogW0K0001
web_production: 1578
Linguistic boosting factor. Type of extensions: XfDtShowKnn. Factor: BclmWeightedFLogW0 over stream group 3. The minimum value of the factor over the top extensions.
|
XfDtShowKnnAllSumW2FSumWFieldSet3BclmWeightedFLogW0K0001
web_production: 1579
Linguistic boosting factor. Type of extensions: XfDtShowKnn. Factor: BclmWeightedFLogW0 over stream group 3. The weighted sum of the factor values multiplied by weight (\frac{\sum w_i * (w_i * f_i)}{\sum w_i}), normalized by the total weight.
|
XfDtShowKnnAllMaxWFFieldSet1Bm15FLogK0001
web_production: 1580
Linguistic boosting factor. Type of extensions: XfDtShowKnn. Factor: Bm15FLog over stream group 1. The maximum weighted value of the factor.
|
XfDtShowKnnAllSumWFSumWFieldSet1Bm15FLogK0001
web_production: 1581
Linguistic boosting factor. Type of extensions: XfDtShowKnn. Factor: Bm15FLog over stream group 1. The weighted sum of the factor values, normalized by the total weight.
|
XfDtShowKnnBagOfWordsLongClickSPAnnotationMatchAvgValue
web_production: 1582
Linguistic boosting factor. Type of extensions: XfDtShowKnn. Bag-of-words factor: AnnotationMatchAvgValue over the LongClickSP stream.
|
XfDtShowKnnTopSumW2FSumWFieldSet1Bm15FLogK0001
web_production: 1583
Linguistic boosting factor. Type of extensions: XfDtShowKnn. Factor: Bm15FLog over stream group 1. The weighted sum of the factor values multiplied by weight (\frac{\sum w_i * (w_i * f_i)}{\sum w_i}) over the top extensions, normalized by the total weight of the top extensions.
|
XfDtShowKnnTopMinWFMaxWFieldSet1Bm15FLogK0001
web_production: 1584
Linguistic boosting factor. Type of extensions: XfDtShowKnn. Factor: Bm15FLog over stream group 1. The minimum weighted value of the factor over the top extensions, normalized by the maximum weight among the top extensions.
|
XfDtShowKnnAllMaxWFSumWBodyPairMinProximity
web_production: 1585
Linguistic boosting factor. Type of extensions: XfDtShowKnn. Factor: PairMinProximity over the Body stream. The maximum weighted value of the factor, normalized by the total weight.
|
XfDtShowKnnAllSumW2FSumWFieldSet1Bm15FLogK0001
web_production: 1586
Linguistic boosting factor. Type of extensions: XfDtShowKnn. Factor: Bm15FLog over stream group 1. The weighted sum of the factor values multiplied by weight (\frac{\sum w_i * (w_i * f_i)}{\sum w_i}), normalized by the total weight.
|
XfDtShowKnnBagOfWordsSimpleClickAnnotationMatchAvgValue
web_production: 1587
Linguistic boosting factor. Type of extensions: XfDtShowKnn. Bag-of-words factor: AnnotationMatchAvgValue over the SimpleClick stream.
|
XfDtShowKnnBagOfWordsTitleCosineMaxMatch
web_production: 1588
Linguistic boosting factor. Type of extensions: XfDtShowKnn. Bag-of-words factor: CosineMaxMatch over the Title stream.
|
DssmLogDtBigramsAMHardQueriesNoClicksMixed
web_production: 1596
DSSM model trained on clicks without miner (with no-clicks and am_hard negatives 50/50 and then on am_hard negatives only). Takes bigrams into account.
|
QueryToTextByXfDtShowKnnAllSumW2FSumWTextBocm11Norm256
web_production: 1615
Linguistic boosting factor. Type of extensions: QueryToTextByXfDtShowKnn. Factor: Bocm11Norm256 over the Text stream. The weighted sum of the factor values multiplied by weight (\frac{\sum w_i * (w_i * f_i)}{\sum w_i}).
|
QueryToTextByXfDtShowKnnTopSumW2FSumWBodyMinWindowSize
web_production: 1616
Linguistic boosting factor. Type of extensions: QueryToTextByXfDtShowKnn. Factor: MinWindowSize over the Body stream. The weighted sum of the factor values multiplied by weight (\frac{\sum w_i * (w_i * f_i)}{\sum w_i}) over the top extensions, normalized by the total weight of the top extensions.
|
QueryToTextByXfDtShowKnnAllSumW2FSumWBodyMinWindowSize
web_production: 1617
Linguistic boosting factor. Type of extensions: QueryToTextByXfDtShowKnn. Factor: MinWindowSize over the Body stream. The weighted sum of the factor values multiplied by weight (\frac{\sum w_i * (w_i * f_i)}{\sum w_i}), normalized by the total weight.
|
QueryToTextByXfDtShowKnnTopSumW2FSumWTextBocm11Norm256
web_production: 1618
Linguistic boosting factor. Type of extensions: QueryToTextByXfDtShowKnn. Factor: Bocm11Norm256 over the Text stream. The weighted sum of the factor values multiplied by weight (\frac{\sum w_i * (w_i * f_i)}{\sum w_i}) over the top extensions.
|
QueryToTextByXfDtShowKnnAllMinW
web_production: 1619
Linguistic boosting factor. Type of extensions: QueryToTextByXfDtShowKnn. The minimum extension weight.
|
QueryToTextByXfDtShowKnnAllAvgW
web_production: 1620
Linguistic boosting factor. Type of extensions: QueryToTextByXfDtShowKnn. The arithmetic mean of the extension weights.
|
QueryToTextByXfDtShowKnnAllTotalW
web_production: 1621
Linguistic boosting factor. Type of extensions: QueryToTextByXfDtShowKnn. The total weight of the extensions.
|
QueryToTextByXfDtShowKnnBagOfWordsFieldSetBagOfWordsOriginalRequestFraction
web_production: 1622
Linguistic boosting factor. Type of extensions: QueryToTextByXfDtShowKnn. Factor: OriginalRequestFraction over the FieldSetBagOfWords stream.
|
UnexpectedTrashUrlQuality
web_production: 1656
Neural document model for detecting unexpected trash (shock content).
|
RequestWithoutVerbsTitleBm15K01
web_production: 1713
The original request with verbs removed, computed over the document title. Word-weight aggregation algorithm: BM15. Normalization coefficient 0.1.
|
RequestWithoutVerbsFieldSetUTBm15FLogW0K00001
web_production: 1714
The original request with verbs removed, computed over a composite stream consisting of the tokenized URL and the document title. Word-weight aggregation algorithm: Bm15FLogW0. Normalization coefficient 0.0001.
|
RequestWithoutVerbsSumWBodyMinWindowSize
web_production: 1715
The original request with verbs removed, computed over the document body. The minimum window size that contains all the words of the request, normalized by the number of words in the request.
|
DssmPantherTerms
web_production: 1773
|
NeuroTextModelLongClickPredictorByWordAndBigramCountersWithSSHards
web_production: 1845
The output of a neural model trained to distinguish long clicks from other events. The model input is word and bigram counters computed over text streams (Title, Body, URL).
|
QfufFilteredByXfOneSeAllMaxFFieldSet2Bm15FLogK0001
web_production: 1847
Linguistic boosting factor. Type of extensions: QFUFFILTEDBYXFONSE (QFUF, filtered on the DSSM models Xfonese). Aggregation on all extensions. The greatest value of the factor. Into aircraft association of the URLs, Title, Body, Correctedctr, Longclick, OneClick, Browserpagerank, Splitdwelltime, SampleperiodDayFrc, SimpleClick, Yabarvisits, Yabartime. The algorithm for aggregation of the scales of words: BM15FLOG (BM15 Aggregation of Logarithm of Construction of Words). Normalization coefficient 0.001.
|
QfufFilteredByXfOneSeAllMaxFFieldSet3BclmWeightedFLogW0K0001
web_production: 1848
Linguistic boosting factor. Type of extensions: QFUFFILTEDBYXFONSE (QFUF, filtered on the DSSM models Xfonese). Aggregation on all extensions. The greatest value of the factor. Rebelled association of streams Title, Body, LongClick, LongClicksp, OneClick. The algorithm for aggregation of the scales of words: BCLMWEIGHTEDFLOGW0. Normalization coefficient 0.001.
|
QfufFilteredByXfOneSeAllMaxFFieldSetUTBm15FLogW0K00001
web_production: 1849
Linguistic boosting factor. Type of extensions: QFUFFILTEDBYXFONSE (QFUF, filtered on the DSSM models Xfonese). Aggregation on all extensions. The greatest value of the factor. It is considered to be composational stream, consisting of an tokenized Url and a title of a document. The algorithm for aggregation of the scales of words: BM15FLOGW0. Normalization coefficient 0.0001.
|
QfufFilteredByXfOneSeAllMaxFTitleBm15K01
web_production: 1850
Linguistic boosting factor. Type of extensions: QFUFFILTEDBYXFONSE (QFUF, filtered on the DSSM models Xfonese). Aggregation on all extensions. The greatest value of the factor. It is considered according to the heading of the document. The algorithm for aggregation of the scales of words: BM15. Normalization coefficient 0.1.
|
QfufFilteredByXfOneSeTopSumWFSumWFieldSet2Bm15FLogK0001
web_production: 1851
Linguistic boosting factor. Type of extensions: QFUFFILTEDBYXFONSE (QFUF, filtered on the DSSM models Xfonese). Aggregation by TOP-10 (by the value of the factor) extensions. A suspended sum of the Libra of factors. Normalized for the total weight of extensions. Into aircraft association of the URLs, Title, Body, Correctedctr, Longclick, OneClick, Browserpagerank, Splitdwelltime, SampleperiodDayFrc, SimpleClick, Yabarvisits, Yabartime. The algorithm for aggregation of the scales of words: BM15FLOG (BM15 Aggregation of Logarithm of Construction of Words). Normalization coefficient 0.001.
|
QfufFilteredByXfOneSeTopSumWFSumWBodyMinWindowSize
web_production: 1852
Linguistic boosting factor. Type of extensions: QFUFFILTEDBYXFONSE (QFUF, filtered on the DSSM models Xfonese). Aggregation by TOP-10 (by the value of the factor) extensions. A suspended sum of the Libra of factors. Normalized for the total weight of extensions. It is considered according to the contents of the document. The minimum window size, which includes all the words of the request. It is normalized for the number of words in the request.
|
OriginalRequestWordsFilteredByDssmSSHardFieldSet1Bm15FLogK0001
web_production: 1853
The factor for the filtered original request: the DSSM state from the request is calculated without words to the initial request, after which the threshold is cut off. Into aircraft association of the URLs, Title, Body, Links, Correctedctr, LongClick, OneClick, Browserpagerank, Splitdwelltime, SampleperiodDayFrc, SimpleClick, Yabarvisits, Yabartime. The algorithm for aggregation of the scales of words: BM15FLOG (BM15 Aggregation of Logarithm of Construction of Words). Normalization coefficient 0.001.
|
OriginalRequestWordsFilteredByDssmSSHardFieldSetUTBm15FLogW0K00001
web_production: 1854
The factor for the filtered original request: the DSSM state from the request is calculated without words to the initial request, after which the threshold is cut off. It is considered to be composational stream, consisting of an tokenized Url and a title of a document. The algorithm for aggregation of the scales of words: BM15FLOGW0. Normalization coefficient 0.0001.
|
DssmCtrEngSsHard
web_production: 1855
DSSM model trained on cross language CTRs using serp similarity hard miner.
|
FractionOfPresentedInTitleWordsWithWeightsByDssmSSHardModel
web_production: 1857
For all words of the request, the weight is calculated by the Query-Mutation method (the distance between the requests in nash and there is no word). The sum of the scales of the words found in the title is taken, divided by the sum of the scales of all words.
|
MaxWeightOfAbsentInTitleWordsWithWeightsByDssmSSHardModel
web_production: 1858
For every query word, a weight is computed by the query-mutation method (the distance between the query with the word and the query without it). The factor is the maximum weight among the words that are absent from the document title.
|
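The two title factors above reduce to simple arithmetic once per-word weights are available. The sketch below assumes the query-mutation weights have already been computed by the DSSM model and are passed in as a plain dictionary; the function names and the exact-string title matching are illustrative assumptions, not the production logic.

```python
from typing import Dict, Set


def fraction_of_present_weight(weights: Dict[str, float], title_words: Set[str]) -> float:
    """Sum of weights of query words found in the title / sum of all weights."""
    total = sum(weights.values())
    if total <= 0.0:
        return 0.0
    present = sum(w for word, w in weights.items() if word in title_words)
    return present / total


def max_weight_of_absent(weights: Dict[str, float], title_words: Set[str]) -> float:
    """Largest weight among query words missing from the title (0.0 if none)."""
    return max((w for word, w in weights.items() if word not in title_words), default=0.0)
```

With hypothetical weights `{"cheap": 0.2, "flights": 0.5, "paris": 0.3}` and the title words `{"flights", "to", "paris"}`, the first function gives 0.8 and the second gives 0.2, the weight of the only missing word.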
NeuroTextModelLongClickPredictorByWordAndBigramCountersWithoutTitleWithSSHards
web_production: 1859
The output of a neural model trained to distinguish long clicks from other events; the model input is word and bigram counters computed over the text streams (Body, URL).
|
DaterAddTime80Hours
web_production: 1861
Computed as (80 - x), where x is the age of the document in hours (continuous). Uses the RobotAddTime date.
|
DaterAddTime10Days
web_production: 1862
Computed as (10 - x), where x is the age of the document in days (continuous). Uses the RobotAddTime date.
|
DaterAge10Days
web_production: 1863
The difference between the current date and the document date as determined by RobotAddTime: 1 if the date equals the current date, 0 if the document is 10 days old or older or the date is not determined.
|
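The three dater factors above can be sketched as follows. The function names are hypothetical. The first two follow the stated formulas literally, and whether the production values are clamped is not stated, so no clamping is applied. For `DaterAge10Days` the descriptions give only the two endpoints (1 for a document dated today, 0 at 10 days or with no date), so the linear interpolation between them is an assumption.

```python
from typing import Optional


def dater_add_time_80_hours(age_hours: float) -> float:
    """(80 - x), x = document age in hours (continuous)."""
    return 80.0 - age_hours


def dater_add_time_10_days(age_days: float) -> float:
    """(10 - x), x = document age in days (continuous)."""
    return 10.0 - age_days


def dater_age_10_days(age_days: Optional[float]) -> float:
    """1.0 for age 0, 0.0 for age >= 10 days or unknown date.

    Assumption: the value decays linearly between the two stated endpoints.
    """
    if age_days is None:
        return 0.0
    return min(1.0, max(0.0, 1.0 - age_days / 10.0))
```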
XfOneSeKnnAllMaxWFMaxWFieldSet1Bm15FLogK0001
web_production: 1864
Linguistic boosting factor. Expansion type: XfOneSeKnn (nearest neighbors under a DSSM model trained to predict XfDtShow expansions). Aggregation over all expansions: the greatest weighted value of the factor, normalized by the maximum expansion weight. Computed over the union of the streams URL, Title, Body, Links, CorrectedCtr, LongClick, OneClick, BrowserPageRank, SplitDwellTime, SamplePeriodDayFrc, SimpleClick, YabarVisits, YabarTime. Word weight aggregation algorithm: BM15FLOG (BM15 Aggregation of Logarithm of Construction of Words). Normalization coefficient 0.001.
|
QueryToTextByXfOneSeKnnTopSumWFSumWBodyMinWindowSize
web_production: 1866
Linguistic boosting factor. Expansion type: QueryToTextByXfOneSeKnn (QueryToText expansions of XfOneSeKnn expansions). Aggregation over the top 10 expansions (by factor value): the weighted sum of factor values, normalized by the total weight of the expansions. Computed over the document body. The minimum size of a window that contains all query words, normalized by the number of words in the query.
|
QueryToTextByXfOneSeKnnAllSumWFSumWFieldSet3BclmWeightedFLogW0K0001
web_production: 1867
Linguistic boosting factor. Expansion type: QueryToTextByXfOneSeKnn (QueryToText expansions of XfOneSeKnn expansions). Aggregation over all expansions: the weighted sum of factor values, normalized by the total weight of the expansions. Computed over the union of the streams Title, Body, LongClick, LongClickSP, OneClick. Word weight aggregation algorithm: BCLMWEIGHTEDFLOGW0. Normalization coefficient 0.001.
|
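The aggregation schemes named in the linguistic boosting factors above (`TopSumWFSumW`, `AllMaxWFMaxW`, `AllMaxF`) can be read as follows. This is a sketch under assumptions: each expansion is modelled as a hypothetical `(weight, factor_value)` pair, "Top" keeps the 10 expansions with the largest factor value, and the helper names are invented for illustration.

```python
from typing import List, Tuple

Expansion = Tuple[float, float]  # (expansion weight, factor value): assumed layout


def top_n(expansions: List[Expansion], n: int = 10) -> List[Expansion]:
    """Keep the n expansions with the largest factor value ("Top")."""
    return sorted(expansions, key=lambda e: e[1], reverse=True)[:n]


def sum_wf_sum_w(expansions: List[Expansion]) -> float:
    """SumWF / SumW: weighted sum of values, normalized by total weight."""
    total_w = sum(w for w, _ in expansions)
    if total_w <= 0.0:
        return 0.0
    return sum(w * f for w, f in expansions) / total_w


def max_wf_max_w(expansions: List[Expansion]) -> float:
    """MaxWF / MaxW: greatest weighted value, normalized by the maximum weight."""
    max_w = max((w for w, _ in expansions), default=0.0)
    if max_w <= 0.0:
        return 0.0
    return max(w * f for w, f in expansions) / max_w


def max_f(expansions: List[Expansion]) -> float:
    """MaxF: greatest unweighted factor value."""
    return max((f for _, f in expansions), default=0.0)
```

Under this reading, a `TopSumWFSumW` factor is `sum_wf_sum_w(top_n(expansions))`, while an `AllMaxWFMaxW` factor is `max_wf_max_w(expansions)` over the full list.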
ReformulationsLongestClickLogDt
web_production: 1885
DSSM model that predicts the logarithm of the longest click on the SERP. Negative examples are URLs taken from earlier queries of the same user, with at most 7 minutes between the queries (super-cords for reformulations).
|
ReformulationsLongestClickLogDtEarlyBindingDssm
web_production: 1892
DSSM model with early binding, trained on reformulations, which predicts the logarithm of the longest click on the SERP.
|
HitContextsDssm
web_production: 1896
Neural network value for contexts of query hits in document text. Predicts relevance-all-8-years. Uses formula ussr-dump-20190719 prs-20190720 all-8-years [t > 0.25] CrossEntropy 20k 0.25 -S 0.8 -Z 1 predictions for learning.
|
DssmReformulationsWithExtensions
web_production: 1898
DSSM model trained on the reformulations pool; in addition to the query itself, it receives as input the 4 XFDT expansions with the largest weight.
|
DssmFomula8YearsCe25Prediction
web_production: 1906
A model trained to predict the value of the formula ussr-dump-20190719 prs-20190720 all-8-years [t > 0.25] CrossEntropy 20k 0.25 -S 0.8 -Z 1.
|
UnexpectedTrashUrlQualityFresh
web_production: 1909
Neural document model for detecting unexpected gruesome content (for ex -)
|
DssmFomula8YearsCe25PredictionRatings
web_production: 1912
A model trained to predict the value of the formula ussr-dump-20190719 prs-20190720 all-8-years [t > 0.25] CrossEntropy 20k 0.25 -S 0.8 -Z 1, additionally trained on relevance ratings.
|
QueryInDirectOfferMax
alice_direct_scenario: 0
Max percent of query words in DirectOffer
|
QueryInDirectOfferMean
alice_direct_scenario: 1
Mean percent of query words in DirectOffer
|
QueryInDirectOfferMin
alice_direct_scenario: 2
Min percent of query words in DirectOffer
|
DirectOfferInQueryMax
alice_direct_scenario: 3
Max percent of DirectOffer words in query
|
DirectOfferInQueryMean
alice_direct_scenario: 4
Mean percent of DirectOffer words in query
|
DirectOfferInQueryMin
alice_direct_scenario: 5
Min percent of DirectOffer words in query
|
QueryInDirectOfferPrefixMax
alice_direct_scenario: 6
Max percent of query words in DirectOffer prefix (query length)
|
QueryInDirectOfferPrefixMean
alice_direct_scenario: 7
Mean percent of query words in DirectOffer prefix (query length)
|
QueryInDirectOfferPrefixMin
alice_direct_scenario: 8
Min percent of query words in DirectOffer prefix (query length)
|
QueryInDirectOfferDoublePrefixMax
alice_direct_scenario: 9
Max percent of query words in DirectOffer prefix (2X query length)
|
QueryInDirectOfferDoublePrefixMean
alice_direct_scenario: 10
Mean percent of query words in DirectOffer prefix (2X query length)
|
QueryInDirectOfferDoublePrefixMin
alice_direct_scenario: 11
Min percent of query words in DirectOffer prefix (2X query length)
|
QueryInDirectTitleMax
alice_direct_scenario: 12
Max percent of query words in DirectTitle
|
QueryInDirectTitleMean
alice_direct_scenario: 13
Mean percent of query words in DirectTitle
|
QueryInDirectTitleMin
alice_direct_scenario: 14
Min percent of query words in DirectTitle
|
DirectTitleInQueryMax
alice_direct_scenario: 15
Max percent of DirectTitle words in query
|
DirectTitleInQueryMean
alice_direct_scenario: 16
Mean percent of DirectTitle words in query
|
DirectTitleInQueryMin
alice_direct_scenario: 17
Min percent of DirectTitle words in query
|
QueryInDirectTitlePrefixMax
alice_direct_scenario: 18
Max percent of query words in DirectTitle prefix (query length)
|
QueryInDirectTitlePrefixMean
alice_direct_scenario: 19
Mean percent of query words in DirectTitle prefix (query length)
|
QueryInDirectTitlePrefixMin
alice_direct_scenario: 20
Min percent of query words in DirectTitle prefix (query length)
|
QueryInDirectTitleDoublePrefixMax
alice_direct_scenario: 21
Max percent of query words in DirectTitle prefix (2X query length)
|
QueryInDirectTitleDoublePrefixMean
alice_direct_scenario: 22
Mean percent of query words in DirectTitle prefix (2X query length)
|
QueryInDirectTitleDoublePrefixMin
alice_direct_scenario: 23
Min percent of query words in DirectTitle prefix (2X query length)
|
QueryInDirectInfoMax
alice_direct_scenario: 24
Max percent of query words in DirectInfo
|
QueryInDirectInfoMean
alice_direct_scenario: 25
Mean percent of query words in DirectInfo
|
QueryInDirectInfoMin
alice_direct_scenario: 26
Min percent of query words in DirectInfo
|
DirectInfoInQueryMax
alice_direct_scenario: 27
Max percent of DirectInfo words in query
|
DirectInfoInQueryMean
alice_direct_scenario: 28
Mean percent of DirectInfo words in query
|
DirectInfoInQueryMin
alice_direct_scenario: 29
Min percent of DirectInfo words in query
|
QueryInDirectInfoPrefixMax
alice_direct_scenario: 30
Max percent of query words in DirectInfo prefix (query length)
|
QueryInDirectInfoPrefixMean
alice_direct_scenario: 31
Mean percent of query words in DirectInfo prefix (query length)
|
QueryInDirectInfoPrefixMin
alice_direct_scenario: 32
Min percent of query words in DirectInfo prefix (query length)
|
QueryInDirectInfoDoublePrefixMax
alice_direct_scenario: 33
Max percent of query words in DirectInfo prefix (2X query length)
|
QueryInDirectInfoDoublePrefixMean
alice_direct_scenario: 34
Mean percent of query words in DirectInfo prefix (2X query length)
|
QueryInDirectInfoDoublePrefixMin
alice_direct_scenario: 35
Min percent of query words in DirectInfo prefix (2X query length)
|
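The `QueryIn…`, `…InQuery`, `…Prefix` and `…DoublePrefix` factors above all follow one pattern, sketched below. The sketch makes several assumptions: texts are pre-tokenized word lists, a "percent" is returned as a fraction in [0, 1], the "prefix (query length)" is the first `len(query)` words of the text and the "2X" variant the first `2 * len(query)` words, and Max/Mean/Min aggregate over several candidate texts (for example several offers). All function names are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def share_of_words(source: List[str], target: List[str]) -> float:
    """Fraction of words from `source` that also occur in `target`."""
    if not source:
        return 0.0
    target_set = set(target)
    return sum(1 for word in source if word in target_set) / len(source)


def query_in_text(query: List[str], text: List[str]) -> float:
    return share_of_words(query, text)


def text_in_query(query: List[str], text: List[str]) -> float:
    return share_of_words(text, query)


def query_in_prefix(query: List[str], text: List[str], multiplier: int = 1) -> float:
    """Share of query words in the first multiplier * len(query) words of the text."""
    return share_of_words(query, text[: multiplier * len(query)])


def aggregate(values: List[float]) -> Dict[str, float]:
    """Max / Mean / Min over candidate texts; zeros when there are none."""
    if not values:
        return {"max": 0.0, "mean": 0.0, "min": 0.0}
    return {"max": max(values), "mean": mean(values), "min": min(values)}
```

For example, `aggregate([query_in_text(q, offer) for offer in offers])` would yield the three values of a `QueryIn…Max/Mean/Min` family in one pass.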
QueryInResultTrackNameRatio
alice_music_scenario: 1
Percent of query words in result track name
|
ResultTrackNameInQueryRatio
alice_music_scenario: 2
Percent of result track name words in query
|
QueryInResultAlbumNameRatio
alice_music_scenario: 3
Percent of query words in result album name
|
ResultAlbumNameInQueryRatio
alice_music_scenario: 4
Percent of result album name words in query
|
QueryInResultArtistNameRatio
alice_music_scenario: 5
Percent of query words in result artist name
|
ResultArtistNameInQueryRatio
alice_music_scenario: 6
Percent of result artist name words in query
|
QueryInWizardTitleRatio
alice_music_scenario: 7
Percent of query words in wizard title
|
WizardTitleInQueryRatio
alice_music_scenario: 8
Percent of wizard title words in query
|
QueryInWizardTrackNameRatio
alice_music_scenario: 9
Percent of query words in wizard track name
|
WizardTrackNameInQueryRatio
alice_music_scenario: 10
Percent of wizard track name words in query
|
QueryInWizardAlbumNameRatio
alice_music_scenario: 11
Percent of query words in wizard album name
|
WizardAlbumNameInQueryRatio
alice_music_scenario: 12
Percent of wizard album name words in query
|
QueryInWizardArtistNameRatio
alice_music_scenario: 13
Percent of query words in wizard artist name
|
WizardArtistNameInQueryRatio
alice_music_scenario: 14
Percent of wizard artist name words in query
|
QueryInWizardTrackLyricsRatio
alice_music_scenario: 15
Percent of query words in wizard track lyrics
|
WizardTrackLyricsInQueryRatio
alice_music_scenario: 16
Percent of wizard track lyrics words in query
|
QueryInDocumentsTitleRatioMin
alice_music_scenario: 17
Min percent of query words in documents title
|
QueryInDocumentsTitleRatioMean
alice_music_scenario: 18
Mean percent of query words in documents title
|
QueryInDocumentsTitleRatioMax
alice_music_scenario: 19
Max percent of query words in documents title
|
DocumentsTitleInQueryRatioMin
alice_music_scenario: 20
Min percent of documents title words in query
|
DocumentsTitleInQueryRatioMean
alice_music_scenario: 21
Mean percent of documents title words in query
|
DocumentsTitleInQueryRatioMax
alice_music_scenario: 22
Max percent of documents title words in query
|
QueryInDocumentsSnippetRatioMin
alice_music_scenario: 23
Min percent of query words in documents snippet
|
QueryInDocumentsSnippetRatioMean
alice_music_scenario: 24
Mean percent of query words in documents snippet
|
QueryInDocumentsSnippetRatioMax
alice_music_scenario: 25
Max percent of query words in documents snippet
|
DocumentsSnippetInQueryRatioMin
alice_music_scenario: 26
Min percent of documents snippet words in query
|
DocumentsSnippetInQueryRatioMean
alice_music_scenario: 27
Mean percent of documents snippet words in query
|
DocumentsSnippetInQueryRatioMax
alice_music_scenario: 28
Max percent of documents snippet words in query
|
QueryInTitleMean
alice_search_scenario: 20
Mean percent of query words in title
|
QueryInTitleMin
alice_search_scenario: 21
Min percent of query words in title
|
TitleInQueryMax
alice_search_scenario: 22
Max percent of title words in query
|
TitleInQueryMean
alice_search_scenario: 23
Mean percent of title words in query
|
TitleInQueryMin
alice_search_scenario: 24
Min percent of title words in query
|
PrefixMax
alice_search_scenario: 25
Max percent of query words in title prefix (query length)
|
PrefixMean
alice_search_scenario: 26
Mean percent of query words in title prefix (query length)
|
PrefixMin
alice_search_scenario: 27
Min percent of query words in title prefix (query length)
|
DoublePrefixMax
alice_search_scenario: 28
Max percent of query words in title prefix (2X query length)
|
DoublePrefixMean
alice_search_scenario: 29
Mean percent of query words in title prefix (2X query length)
|
DoublePrefixMin
alice_search_scenario: 30
Min percent of query words in title prefix (2X query length)
|
QueryInHeadlineMax
alice_search_scenario: 31
Max percent of query words in headline
|
QueryInHeadlineMean
alice_search_scenario: 32
Mean percent of query words in headline
|
QueryInHeadlineMin
alice_search_scenario: 33
Min percent of query words in headline
|
HeadlineInQueryMax
alice_search_scenario: 34
Max percent of headline words in query
|
HeadlineInQueryMean
alice_search_scenario: 35
Mean percent of headline words in query
|
HeadlineInQueryMin
alice_search_scenario: 36
Min percent of headline words in query
|
HeadlinePrefixMax
alice_search_scenario: 37
Max percent of query words in headline prefix (query length)
|
HeadlinePrefixMean
alice_search_scenario: 38
Mean percent of query words in headline prefix (query length)
|
HeadlinePrefixMin
alice_search_scenario: 39
Min percent of query words in headline prefix (query length)
|
DoubleHeadlinePrefixMax
alice_search_scenario: 40
Max percent of query words in headline prefix (2X query length)
|
DoubleHeadlinePrefixMean
alice_search_scenario: 41
Mean percent of query words in headline prefix (2X query length)
|
DoubleHeadlinePrefixMin
alice_search_scenario: 42
Min percent of query words in headline prefix (2X query length)
|
ItemSelectorConfidence
alice_video_scenario: 14
Confidence of item selector in current gallery
|
QueryInItemNameMax
alice_video_scenario: 15
Max percent of query words in ItemName
|
QueryInItemNameMean
alice_video_scenario: 16
Mean percent of query words in ItemName
|
QueryInItemNameMin
alice_video_scenario: 17
Min percent of query words in ItemName
|
ItemNameInQueryMax
alice_video_scenario: 18
Max percent of ItemName words in query
|
ItemNameInQueryMean
alice_video_scenario: 19
Mean percent of ItemName words in query
|
ItemNameInQueryMin
alice_video_scenario: 20
Min percent of ItemName words in query
|
QueryInItemNamePrefixMax
alice_video_scenario: 21
Max percent of query words in ItemName prefix (query length)
|
QueryInItemNamePrefixMean
alice_video_scenario: 22
Mean percent of query words in ItemName prefix (query length)
|
QueryInItemNamePrefixMin
alice_video_scenario: 23
Min percent of query words in ItemName prefix (query length)
|
QueryInItemNameDoublePrefixMax
alice_video_scenario: 24
Max percent of query words in ItemName prefix (2X query length)
|
QueryInItemNameDoublePrefixMean
alice_video_scenario: 25
Mean percent of query words in ItemName prefix (2X query length)
|
QueryInItemNameDoublePrefixMin
alice_video_scenario: 26
Min percent of query words in ItemName prefix (2X query length)
|
QueryInItemDescriptionMax
alice_video_scenario: 27
Max percent of query words in ItemDescription
|
QueryInItemDescriptionMean
alice_video_scenario: 28
Mean percent of query words in ItemDescription
|
QueryInItemDescriptionMin
alice_video_scenario: 29
Min percent of query words in ItemDescription
|
ItemDescriptionInQueryMax
alice_video_scenario: 30
Max percent of ItemDescription words in query
|
ItemDescriptionInQueryMean
alice_video_scenario: 31
Mean percent of ItemDescription words in query
|
ItemDescriptionInQueryMin
alice_video_scenario: 32
Min percent of ItemDescription words in query
|
QueryInItemDescriptionPrefixMax
alice_video_scenario: 33
Max percent of query words in ItemDescription prefix (query length)
|
QueryInItemDescriptionPrefixMean
alice_video_scenario: 34
Mean percent of query words in ItemDescription prefix (query length)
|
QueryInItemDescriptionPrefixMin
alice_video_scenario: 35
Min percent of query words in ItemDescription prefix (query length)
|
QueryInItemDescriptionDoublePrefixMax
alice_video_scenario: 36
Max percent of query words in ItemDescription prefix (2X query length)
|
QueryInItemDescriptionDoublePrefixMean
alice_video_scenario: 37
Mean percent of query words in ItemDescription prefix (2X query length)
|
QueryInItemDescriptionDoublePrefixMin
alice_video_scenario: 38
Min percent of query words in ItemDescription prefix (2X query length)
|
ItemSelectorConfidenceByName
alice_video_scenario: 40
Confidence of item selector by name in current gallery
|
ItemSelectorConfidenceByNumber
alice_video_scenario: 41
Confidence of item selector by number in current gallery
|
AbsolutePLM
collections_production: 3
|
Bclm
collections_production: 4
|
TxtBm25Sy
collections_production: 5
|
DocLen
collections_production: 6
|
Bclm2
collections_production: 7
|
Tocm
collections_production: 8
|
TitleTrigramsTitle
collections_production: 9
|
TextBM25_Fm_W1
collections_production: 10
|
TxtBm25Ex
collections_production: 11
|
TextBM25
collections_production: 12
|
TextBM25_Sy_W1
collections_production: 13
|
TxtHeadSy
collections_production: 14
|
YmwFull2
collections_production: 15
|
TxtHeadEx
collections_production: 16
|
TxtHead
collections_production: 17
|
TxtBreakSy
collections_production: 18
|
TxtBreakEx
collections_production: 19
|
RussianSrcOwnersShare
images_cbir: 8
Fraction of hosts with Russian language
|
GruesomeCombined
images_l1: 93
Output of the aggregated gruesome-content classifier; used in averaged form to detect gruesome queries.
|
RussianSrcOwnersShare
images_market_l4: 366
Fraction of hosts with Russian language
|
RussianSrcOwnersShare
images_market: 372
Fraction of hosts with Russian language
|
VwChildPorn
images_new_runtime_doc_features: 1
Value of the DP classifier; used in averaged form for filtering.
|
VwDwellTime
images_new_runtime_doc_features: 23
Output of the long-view text classifier built with Vowpal Wabbit.
|
GruesomeCombined
images_new_runtime_doc_features: 28
Output of the aggregated gruesome-content classifier; used in averaged form to detect gruesome queries.
|
DocIdfSumFixed
images_new_runtime_doc_features: 49
Previous factors - fixed
|
VwSuggestive
images_new_runtime_doc_features: 50
Output of the Suggestive text classifier built with Vowpal Wabbit.
|
VwPorno2
images_new_runtime_doc_features: 59
Output of the porn text classifier built with Vowpal Wabbit.
|
VwGruesome2
images_new_runtime_doc_features: 60
Output of the gruesome-content text classifier built with Vowpal Wabbit.
|
ImagePorno4
images_new_runtime_doc_features: 85
Image porn classifier output
|
TurkishSrcOwnersShare
images_new_runtime_doc_features: 86
Fraction of hosts with Turkish language
|
RussianSrcOwnersShare
images_new_runtime_doc_features: 88
Fraction of hosts with Russian language
|
VwChildPorn
images_production: 1
Value of the DP classifier; used in averaged form for filtering.
|
ImageLangsFound
images_production: 12
|
ImageNearbyTextBm15MaxK3MaxMeta
images_production: 31
Feature from utracker
|
ImageOwnersWithAllWords
images_production: 35
|
ImageOwnersWithHitsShare
images_production: 36
|
ImageOwnersWithAllWordsShare
images_production: 37
|
VwDwellTime
images_production: 42
Output of the long-view text classifier built with Vowpal Wabbit.
|
BFexact
images_production: 47
All query words occur in their exact form in the text/links.
|
HasAllWordsTRSy
images_production: 56
The document contains all query words (up to synonyms).
|
GruesomeCombined
images_production: 58
Output of the aggregated gruesome-content classifier; used in averaged form to detect gruesome queries.
|
ImageLinksWithAllWords
images_production: 85
|
LargestSyInexactGroup
images_production: 99
The share of the query covered by the longest group of hits of any kind (including word forms and synonyms), possibly with a word skipped, added, or replaced.
|
DocIdfSumFixed
images_production: 101
Previous factors - fixed
|
ImageDBM
images_production: 102
BM25-like factor with custom coefficients.
|
VwSuggestive
images_production: 104
Output of the Suggestive text classifier built with Vowpal Wabbit.
|
QfufAllMaxFTitleBclmMixPlainKE5
images_production: 111
Linguistic boosting factor. Expansion type: QFUF. Aggregation over all expansions: the greatest value of the factor. Computed over the document title. Word weight aggregation algorithm: BclmMixPlain, a linear mixture of the annotation BCLM weights and the weighted positionless weights of a word; the resulting counters are then aggregated with BM15. Normalization coefficient 10^(-5).
|
ImageMaxWordsPairShare
images_production: 126
|
ImageMinLemmaWordsShare
images_production: 127
|
ImageMaxLemmaAndSynonymWordsShare
images_production: 128
|
ImageMaxWordsPairSumExpWeight
images_production: 131
|
MaxLeftAndRightBigramsWeight
images_production: 132
|
ImageDBM25
images_production: 133
|
ImageIBocm
images_production: 153
|
NumWordsTRSy
images_production: 164
The percentage of query words found in the document (up to synonyms).
|
ImagePhraseLRSyn
images_production: 165
|
VwPorno2
images_production: 176
Output of the porn text classifier built with Vowpal Wabbit.
|
VwGruesome2
images_production: 177
Output of the gruesome-content text classifier built with Vowpal Wabbit.
|
HasAllWordsTRSyAvgMeta
images_production: 186
The document contains all query words (up to synonyms), top average.
|
TextWord1TFRatioAvgMeta
images_production: 187
Ratio of word_1 hits to all hits - top average
|
TextWord2TFRatioAvgMeta
images_production: 188
Ratio of word_2 hits to all hits - top average
|
ImageIBocmDouble
images_production: 206
|
TextWord3TFRatioAvgMeta
images_production: 216
Ratio of word_3 hits to all hits - top avg
|
WordPairsWeight
images_production: 254
|
TextWord1PosShiftAvgMeta
images_production: 260
Distance from average position of word 1 to position 1 - top avg
|
TextWord1PosShift
images_production: 306
Distance from average position of word 1 to position 1
|
TextWord2PosShift
images_production: 307
Distance from average position of word 2 to position 2
|
TextMinWordPosShift
images_production: 308
Min of distances from the average position of word_i to position i
|
TextMaxWeightedWordPosVariance
images_production: 309
Max weighted variance of position of word_i for all i
|
TextWord1ExactRatio
images_production: 310
Ratio of EQUAL_BY_STRING hits of word_1 to all hits of word_1
|
TextWord2ExactRatio
images_production: 311
Ratio of EQUAL_BY_STRING hits of word_2 to all hits of word_2
|
TextWord3ExactRatio
images_production: 312
Ratio of EQUAL_BY_STRING hits of word_3 to all hits of word_3
|
TextGlobalExactRatio
images_production: 313
Ratio of EQUAL_BY_STRING hits to all hits for all words
|
TextWord1TFRatio
images_production: 314
Ratio of word_1 hits to all hits
|
TextWord2TFRatio
images_production: 315
Ratio of word_2 hits to all hits
|
TextWord3TFRatio
images_production: 316
Ratio of word_3 hits to all hits
|
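The hit-ratio factors above (`TextWordNTFRatio`, `TextWordNExactRatio`, `TextGlobalExactRatio`) can be sketched as plain counting over a list of hits. The hit representation used here, a pair of a 1-based query word index and a flag saying whether the match is EQUAL_BY_STRING, is an assumption made for illustration, as are the function names.

```python
from typing import List, Tuple

Hit = Tuple[int, bool]  # (1-based query word index, is EQUAL_BY_STRING match): assumed


def tf_ratio(hits: List[Hit], word_index: int) -> float:
    """Hits of word_i divided by all hits."""
    if not hits:
        return 0.0
    return sum(1 for i, _ in hits if i == word_index) / len(hits)


def exact_ratio(hits: List[Hit], word_index: int) -> float:
    """EQUAL_BY_STRING hits of word_i divided by all hits of word_i."""
    word_hits = [exact for i, exact in hits if i == word_index]
    if not word_hits:
        return 0.0
    return sum(1 for exact in word_hits if exact) / len(word_hits)


def global_exact_ratio(hits: List[Hit]) -> float:
    """EQUAL_BY_STRING hits divided by all hits, over all words."""
    if not hits:
        return 0.0
    return sum(1 for _, exact in hits if exact) / len(hits)
```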
TextAvgDistForTwoImportantStr
images_production: 317
Average distance between EQUAL_BY_STRING hits of two max-weighted words in one break
|
TextMaxPrefixLenStr
images_production: 318
Max prefix length (in words) of an exact (by position and form) match
|
TextAvgDist12Syn
images_production: 319
Average distance between word_1 and word_2 in a break
|
TextMaxAvgDistSyn
images_production: 320
Max average distance between word_i and word_i+1 in a break
|
TextMaxPrefixLenSyn
images_production: 321
Max prefix length (in words) of exact (by position) matches
|
ImagePorno4
images_production: 338
Image porn classifier output
|
TurkishSrcOwnersShare
images_production: 339
Fraction of hosts with Turkish language
|
RussianSrcOwnersShare
images_production: 341
Fraction of hosts with Russian language
|
ImageAltTitleBocm
images_production: 361
|
ImageAltTitleBocmDouble
images_production: 362
|
ImageNonAltTitleBocm
images_production: 363
|
ImageNonAltTitleBocmDouble
images_production: 364
|
ImageAltTitleValueWcmAvg
images_production: 366
Feature from utracker
|
ImageAltTitleValueWcmPrediction
images_production: 367
Feature from utracker
|
ImageAltTitleBm15MaxK3
images_production: 368
Feature from utracker
|
ImageAltTitleBclmPlainW1K3
images_production: 369
Feature from utracker
|
ImageAltTitleBclmWeightedK3
images_production: 370
Feature from utracker
|
ImageAltTitleBocmWeightedW1K3
images_production: 371
Feature from utracker
|
ImageAltTitleBocmWeightedMaxK1
images_production: 372
Feature from utracker
|
ImageAltTitleBm15CoverageK3
images_production: 373
Feature from utracker
|
ImageAltTitleBclmWeightedV2K3
images_production: 374
Feature from utracker
|
ImageAltTitleBocmDoubleK5
images_production: 375
Feature from utracker
|
ImageNearbyTextValueWcmAvg
images_production: 376
Feature from utracker
|
ImageNearbyTextValueWcmPrediction
images_production: 377
Feature from utracker
|
ImageNearbyTextBm15MaxK3
images_production: 378
Feature from utracker
|
ImageNearbyTextBclmPlainW1K3
images_production: 379
Feature from utracker
|
ImageNearbyTextBclmWeightedK3
images_production: 380
Feature from utracker
|
ImageNearbyTextBocmWeightedW1K3
images_production: 381
Feature from utracker
|
ImageNearbyTextBocmWeightedMaxK1
images_production: 382
Feature from utracker
|
ImageNearbyTextBocmDoubleK5
images_production: 383
Feature from utracker
|
ImageTextAnnotationMatchPredictionWeighted
images_production: 384
Feature from utracker
|
ImageTextValueWcmAvg
images_production: 385
Feature from utracker
|
ImageTextValueWcmPrediction
images_production: 386
Feature from utracker
|
ImageTextBm15StrictK2
images_production: 387
Feature from utracker
|
ImageTextBm15MaxK3
images_production: 388
Feature from utracker
|
ImageTextBclmPlainW1K3
images_production: 389
Feature from utracker
|
ImageTextBclmWeightedK3
images_production: 390
Feature from utracker
|
ImageTextBocmWeightedW1K3
images_production: 391
Feature from utracker
|
ImageTextBocmWeightedMaxK1
images_production: 392
Feature from utracker
|
ImageTextBm15K9
images_production: 393
Feature from utracker
|
ImageTextBm15CoverageK3
images_production: 394
Feature from utracker
|
ImageTextBm15CoverageV2K3
images_production: 395
Feature from utracker
|
ImageTextBm15CoverageV4K3
images_production: 396
Feature from utracker
|
ImageTextBclmPlainK5
images_production: 397
Feature from utracker
|
ImageTextBclmWeightedV2K3
images_production: 398
Feature from utracker
|
ImageTextBclmMixPlainW1K1
images_production: 399
Feature from utracker
|
ImageTextBocmPlain
images_production: 400
Feature from utracker
|
ImageTextBocmWeightedK5
images_production: 401
Feature from utracker
|
ImageTextBocmWeightedK7
images_production: 402
Feature from utracker
|
ImageTextBocmWeightedK9
images_production: 403
Feature from utracker
|
ImageTextBocmWeightedV4W1K2
images_production: 404
Feature from utracker
|
ImageTextBocmDoubleK5
images_production: 405
Feature from utracker
|
AnnL1BocmPlain
images_production: 406
Bocm plain used in fastrank for all ann hits
|
AnnL1BocmDouble
images_production: 407
Bocm double used in fastrank for all ann hits
|
AnnL1BfExact
images_production: 408
BfExact used in fastrank for all ann hits
|
AnnL1ImageDbm25
images_production: 409
ImageDbm25 used in fastrank for all ann hits
|
AnnL1LargestSyInexactGroup
images_production: 410
LargestSyInexactGroup used in fastrank for all ann hits
|
ImageOwnersWithHitsShareAvgMeta
images_production: 425
|
TextWordWeightSum
images_production: 439
Sum of weights for found words in indexkey/inv
|
XfImgClicksAllMaxFTitleWordCoverageForm
images_production: 468
Linguistic boosting factor. Expansion type: XfImgClicks. Aggregation over all expansions: the greatest value of the factor. Computed over the document title. The degree of coverage of the query words, up to word form (without synonyms).
|
XfImgClicksAllMaxWFTitleExactQueryMatchAvgValue
images_production: 469
Linguistic boosting factor. Expansion type: XfImgClicks. Aggregation over all expansions: the greatest weighted value of the factor. Computed over the document title. The average weight of the annotations in which the query occurs as an exact match.
|
QfufAllMaxWFTitleBclmPlaneProximity1Bm15W0Size1K0001
images_production: 631
Linguistic boosting factor. Expansion type: QFUF. Aggregation over all expansions: the greatest weighted value of the factor. Computed over the document title. The BclmPlaneProximity1Bm15W0Size1 algorithm: uses BCLM with free weighting if there are several words; if there is only one word, the sum of hits by match type is used. Normalization coefficient 0.001.
|
QfufTopMinWFSumWTitleBclmMixPlainKE5
images_production: 632
Linguistic boosting factor. Expansion type: QFUF. Aggregation over the top 10 expansions (by factor value): the smallest weighted value of the factor, normalized by the total weight of the expansions. Computed over the document title. Word weight aggregation algorithm: BclmMixPlain, a linear mixture of the annotation BCLM weights and the weighted positionless weights of a word; the resulting counters are then aggregated with BM15. Normalization coefficient 10^(-5).
|
QfufAllMaxWFMaxWTitleExactQueryMatchAvgValue
images_production: 634
Linguistic boosting factor. Expansion type: QFUF. Aggregation over all expansions: the greatest weighted value of the factor, normalized by the maximum expansion weight. Computed over the document title. The average weight of the annotations in which the query occurs as an exact match.
|
XfDtShowAllMaxWFMaxWTitleExactQueryMatchAvgValue
images_production: 657
Linguistic boosting factor. Expansion type: XfDtShow. Aggregation over all expansions: the greatest weighted value of the factor, normalized by the maximum expansion weight. Computed over the document title. The average weight of the annotations in which the query occurs as an exact match.
|
AllMatchedWordWeightsSumText
images_production: 686
The normalized sum of the weights of the query words that occur in the document text.
|
StringMatchedWordWeightsSumText
images_production: 688
The normalized sum of the weights of the query words that have EQUAL_BY_STRING matches in the document text.
|
StringMatchedWordWeightsSumAnn
images_production: 689
The normalized sum of the weights of the query words that have EQUAL_BY_STRING matches in the document annotations.
|
AllMatchedWordFiltrationModelWeightsSumText
images_production: 690
The normalized sum of the FiltrationModel weights of the query words that occur in the document text.
|
AllMatchedWordFiltrationModelWeightsSumAnn
images_production: 691
The normalized sum of the FiltrationModel weights of the query words that occur in the document annotations.
|
StringMatchedWordFiltrationModelWeightsSumText
images_production: 692
The normalized sum of the FiltrationModel weights of the query words that have EQUAL_BY_STRING matches in the document text.
|
StringMatchedWordFiltrationModelWeightsSumAnn
images_production: 693
The normalized sum of the FiltrationModel weights of the query words that have EQUAL_BY_STRING matches in the document annotations.
|
VwChildPorn
images_recommendations: 1
Value of the DP classifier; used in averaged form for filtering.
|
VwDwellTime
images_recommendations: 24
Output of the long-view text classifier built with Vowpal Wabbit.
|
GruesomeCombined
images_recommendations: 29
Output of the aggregated gruesome-content classifier; used in averaged form to detect gruesome queries.
|
DocIdfSumFixed
images_recommendations: 50
Previous factors - fixed
|
VwSuggestive
images_recommendations: 51
Output of the Suggestive text classifier built with Vowpal Wabbit.
|
VwPorno2
images_recommendations: 70
Output of the porn text classifier built with Vowpal Wabbit.
|
VwGruesome2
images_recommendations: 71
Output of the gruesome-content text classifier built with Vowpal Wabbit.
|
ImagePorno4
images_recommendations: 114
Image porn classifier output
|
TurkishSrcOwnersShare
images_recommendations: 115
Fraction of hosts with Turkish language
|
RussianSrcOwnersShare
images_recommendations: 117
Fraction of hosts with Russian language
|
KinopoiskSuggestAllMaxWFMaxWTitleExactQueryMatchAvgValue
kp_text_machine: 0
Linguistic boosting factor. Expansion type: KinopoiskSuggest (extensions of the textual orgate to text saddles). Aggregation over all expansions: the greatest weighted value of the factor, normalized by the maximum expansion weight. Computed over the document title. The average weight of the annotations in which the query occurs as an exact match.
|
KinopoiskSuggestTopMinWFMaxWTitleBclmMixPlainKE5
kp_text_machine: 1
Linguistic boosting factor. Type of extensions: Kinopoisksuggest (extensions of the textual orgate to text saddles). Aggregation by TOP-10 (by the value of the factor) extensions. Type of aggregation for extensions: the smallest balanced value of the factor; The maximum weight of the extension. It is considered according to the heading of the document. The algorithm for aggregation of words weights is BCLMMIXPLAIN: a linear mixture of annotation BCLM weights and balanced Positionless weights of the word, then the former meters are aggregated through BM15. Normalization coefficient 10^(-5).
|
KinopoiskSuggestTopSumW2FSumWTitleExactQueryMatchAvgValue
kp_text_machine: 2
Linguistic boosting factor. Type of extensions: Kinopoisksuggest (extensions of the textual orgate to text saddles). Aggregation by TOP-10 (by the value of the factor) extensions. Type of aggregation for extensions: an abstract by square of expansion weight, multiplied by the value of the factor; normalized for the total weight of extensions. It is considered according to the heading of the document. The average weight of the anntations among those in which the request was an accurate tuning.
|
KinopoiskSuggestAllMaxWFTitleExactQueryMatchAvgValue
kp_text_machine: 3
Linguistic boosting factor. Type of extensions: Kinopoisksuggest (extensions of the textual orgate to text saddles). Aggregation on all extensions. Type of aggregation for extensions: the greatest balanced value of the factor; It is considered according to the heading of the document. The average weight of the anntations among those in which the request was an accurate tuning.
|
KinopoiskSuggestAllMaxFTitleAttenV1Bm15K001
kp_text_machine: 4
Linguistic boosting factor. Expansion type: KinopoiskSuggest (expansions of the original text query to text suggestions). Aggregation over all expansions. Aggregation type: the maximum factor value. Computed over the document title. The hit weight is multiplied by 1/(1 + position of the word in the sentence). Word-weight aggregation algorithm: BM15. Normalization coefficient 0.01.
|
KinopoiskSuggestAllMaxFTitleWordCoverageExact
kp_text_machine: 5
Linguistic boosting factor. Expansion type: KinopoiskSuggest (expansions of the original text query to text suggestions). Aggregation over all expansions. Aggregation type: the maximum factor value. Computed over the document title. Coverage of the query words in exact form.
|
KinopoiskSuggestTopMinWFTitleWordCoverageForm
kp_text_machine: 6
Linguistic boosting factor. Expansion type: KinopoiskSuggest (expansions of the original text query to text suggestions). Aggregation over the top 10 expansions (by factor value). Aggregation type: the minimum weighted factor value. Computed over the document title. Coverage of the query words up to word form (without synonyms).
|
KinopoiskSuggestAllMaxWFSumWTitleExactQueryMatchAvgValue
kp_text_machine: 7
Linguistic boosting factor. Expansion type: KinopoiskSuggest (expansions of the original text query to text suggestions). Aggregation over all expansions. Aggregation type: the maximum weighted factor value, normalized by the total expansion weight. Computed over the document title. The average weight of the annotations in which the query was an exact match.
|
KinopoiskSuggestAllSumW2FSumWTitleExactQueryMatchAvgValue
kp_text_machine: 8
Linguistic boosting factor. Expansion type: KinopoiskSuggest (expansions of the original text query to text suggestions). Aggregation over all expansions. Aggregation type: the sum over expansions of the squared expansion weight multiplied by the factor value, normalized by the total expansion weight. Computed over the document title. The average weight of the annotations in which the query was an exact match.
|
KinopoiskSuggestAllMaxWFMaxWTitleCosineMatchMaxPrediction
kp_text_machine: 9
Linguistic boosting factor. Expansion type: KinopoiskSuggest (expansions of the original text query to text suggestions). Aggregation over all expansions. Aggregation type: the maximum weighted factor value, normalized by the maximum expansion weight. Computed over the document title. Algorithm: CosineMatchMaxPrediction.
|
KinopoiskSuggestTopMinWFSumWTitleExactQueryMatchAvgValue
kp_text_machine: 10
Linguistic boosting factor. Expansion type: KinopoiskSuggest (expansions of the original text query to text suggestions). Aggregation over the top 10 expansions (by factor value). Aggregation type: the minimum weighted factor value, normalized by the total expansion weight. Computed over the document title. The average weight of the annotations in which the query was an exact match.
|
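The aggregation schemes named in the kp_text_machine rows above (MaxWF/MaxW, MinWF/SumW, SumW2F/SumW, optionally over the top 10 expansions by factor value) can be sketched as follows. This is only an illustration of how the descriptions read: the `(weight, value)` pair layout and all function names are assumptions, not the production code.

```python
from typing import List, Tuple

# Hypothetical layout: each expansion is (expansion weight w, factor value f).
Expansion = Tuple[float, float]


def top_by_factor(exps: List[Expansion], k: int = 10) -> List[Expansion]:
    """Keep the k expansions with the largest factor value (the 'Top' variants)."""
    return sorted(exps, key=lambda e: e[1], reverse=True)[:k]


def max_wf_max_w(exps: List[Expansion]) -> float:
    """MaxWF / MaxW: largest weighted value, normalized by the largest weight."""
    max_w = max((w for w, _ in exps), default=0.0)
    if max_w == 0.0:
        return 0.0
    return max(w * f for w, f in exps) / max_w


def min_wf_sum_w(exps: List[Expansion]) -> float:
    """MinWF / SumW: smallest weighted value, normalized by the total weight."""
    total_w = sum(w for w, _ in exps)
    if total_w == 0.0:
        return 0.0
    return min(w * f for w, f in exps) / total_w


def sum_w2f_sum_w(exps: List[Expansion]) -> float:
    """SumW2F / SumW: sum of w^2 * f, normalized by the total weight."""
    total_w = sum(w for w, _ in exps)
    if total_w == 0.0:
        return 0.0
    return sum(w * w * f for w, f in exps) / total_w
```

An "All" factor would pass every expansion to the aggregator, while a "Top" factor would first apply `top_by_factor`.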
IsPorno
neural_network_over_dssm_factors: 0
Document from porn kitski
|
IsFake
neural_network_over_dssm_factors: 2
Fake document
|
IsEShop
neural_network_over_dssm_factors: 3
Commercial page (Classifier Savina)
|
HasPayments
neural_network_over_dssm_factors: 6
The page contains text about 'payment SMS'.
|
EshopValue
neural_network_over_dssm_factors: 9
Degree to which the page resembles an online shop
|
PornoValue
neural_network_over_dssm_factors: 10
Degree to which the page is pornographic
|
IsPornoAdvert
neural_network_over_dssm_factors: 11
The page contains porn advertising
|
Poetry
neural_network_over_dssm_factors: 12
Degree to which the document is poetry
|
PoetryQuad
neural_network_over_dssm_factors: 13
Maximum poetry score of a quatrain
|
SynS1
neural_network_over_dssm_factors: 14
Shows how unnatural the text is from the point of view of the Russian language. An estimate of how likely the document text was generated by a synonymizer or other automatic means. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
SynFLremap1
neural_network_over_dssm_factors: 15
Shows how unnatural the text is from the point of view of the Russian language. An estimate of how likely the document text was generated by a synonymizer or other automatic means. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
SynFLremap2
neural_network_over_dssm_factors: 16
Shows how unnatural the text is from the point of view of the Russian language. An estimate of how likely the document text was generated by a synonymizer or other automatic means. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
SynPercentBadWordPairs
neural_network_over_dssm_factors: 18
An indicator of how unnatural the text is from the point of view of the Russian language. The number of bad word pairs in the text, mapped into the interval [0, 1] by the formula Z/(Z+10)
|
SynNumBadWordPairs
neural_network_over_dssm_factors: 19
The share of bad pairs among all pairs found in the table: Z/(X+1), where Z is the number of bad pairs in the text and X is the number of ((http://wiki.yandex-team.ru/evgenijgrechnikov/testsynonimizers 2000-navigable)) pairs
|
NumLatinLetters
neural_network_over_dssm_factors: 20
The number of Latin letters in the text (not counting markup), mapped into [0, 1] by the formula n/(n+100)
|
RusWordsInText
neural_network_over_dssm_factors: 22
The number of words in the text (a word is whatever the lemmatizer extracted), mapped into [0, 1] by the formula x/(x+a)
|
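Several rows above map an unbounded count into [0, 1] with a formula of the form x/(x+a), for example Z/(Z+10) and n/(n+100). A minimal sketch of that mapping follows; the function name `saturate` is an assumption chosen for illustration.

```python
def saturate(x: float, a: float) -> float:
    """Map a non-negative count x into [0, 1) as x / (x + a).

    The constant a is the count at which the result reaches 0.5
    (10 for bad word pairs, 100 for Latin letters in the rows above).
    """
    if x < 0 or a <= 0:
        raise ValueError("expected x >= 0 and a > 0")
    return x / (x + a)
```

For instance, `saturate(10, 10)` gives 0.5 and `saturate(300, 100)` gives 0.75, so larger counts approach 1 without ever reaching it.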
RusWordsInTitle
neural_network_over_dssm_factors: 23
The number of words of the Russian language in the title
|
MeanWordLength
neural_network_over_dssm_factors: 24
The average length of the word
|
PercentWordsInLinks
neural_network_over_dssm_factors: 25
Percentage of words inside <a> .. </a> tags among all words
|
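As a rough illustration of PercentWordsInLinks, the sketch below counts whitespace-separated words inside and outside `<a>` elements using Python's standard `html.parser`. The class and function names are hypothetical, and the real tokenization is not described in the source.

```python
from html.parser import HTMLParser


class LinkWordCounter(HTMLParser):
    """Counts words in text nodes, tracking which ones sit inside <a>...</a>."""

    def __init__(self) -> None:
        super().__init__()
        self.link_depth = 0   # > 0 while inside at least one <a>
        self.total_words = 0
        self.link_words = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.link_depth > 0:
            self.link_depth -= 1

    def handle_data(self, data):
        n = len(data.split())
        self.total_words += n
        if self.link_depth:
            self.link_words += n


def percent_words_in_links(html: str) -> float:
    """Percentage of words that occur inside <a> tags, 0.0 for empty text."""
    parser = LinkWordCounter()
    parser.feed(html)
    parser.close()
    if parser.total_words == 0:
        return 0.0
    return 100.0 * parser.link_words / parser.total_words
```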
PercentVisibleContent
neural_network_over_dssm_factors: 26
Percentage of words outside tags (outside the angle brackets <>) among all words
|
PercentFreqWords
neural_network_over_dssm_factors: 27
Percentage of words in the text that are among the 200 most frequent words of the language
|
PercentUsedFreqWords
neural_network_over_dssm_factors: 28
Number of the 500 most frequent words of the language that are used in the text, divided by 500
|
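The two frequency-list factors above differ in what they count: one is the share of tokens that are frequent words, the other is the share of the frequency list that the text actually uses. A small sketch, assuming a pre-tokenized text and a caller-supplied frequency list (the real 200- and 500-word lists are not given in the source):

```python
from typing import Iterable, Set


def percent_freq_words(tokens: Iterable[str], frequent: Set[str]) -> float:
    """Percentage of tokens that belong to the frequent-word list."""
    tokens = list(tokens)
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in frequent)
    return 100.0 * hits / len(tokens)


def share_of_used_freq_words(tokens: Iterable[str], frequent: Set[str]) -> float:
    """Number of distinct frequent words used in the text, divided by the list size."""
    if not frequent:
        return 0.0
    return len(set(tokens) & frequent) / len(frequent)
```

With a toy list `{"the", "and", "of"}` and the tokens `the cat and the dog`, the first function counts three of five tokens and the second counts two of three list entries.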
TrigramsProb
neural_network_over_dssm_factors: 29
Logarithm of the geometric mean of the trigram probabilities in the text (the probability of a trigram is the number of its occurrences in the text divided by the total number of trigrams), mapped into [0, 1] by the formula -x (x+a)
|
TrigramsCondProb
neural_network_over_dssm_factors: 30
Logarithm of the geometric mean of the conditional probabilities of trigrams. The conditional probability of a trigram is its probability divided by the probability of the bigram formed by its first two words
|
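The two trigram factors above can be illustrated with empirical counts taken from the text itself, as the descriptions state. The sketch below works on word tokens and omits the final mapping into [0, 1]; the function names are assumptions.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """All consecutive n-grams of the token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def log_geo_mean_trigram_prob(tokens: Sequence[str]) -> float:
    """Log of the geometric mean of p(trigram) = count / number of trigrams."""
    tri = ngrams(tokens, 3)
    if not tri:
        return 0.0
    counts = Counter(tri)
    return sum(math.log(counts[t] / len(tri)) for t in tri) / len(tri)


def log_geo_mean_trigram_cond_prob(tokens: Sequence[str]) -> float:
    """Same, but each p(trigram) is divided by p(bigram of its first two words)."""
    tri, bi = ngrams(tokens, 3), ngrams(tokens, 2)
    if not tri:
        return 0.0
    tri_counts, bi_counts = Counter(tri), Counter(bi)
    total = 0.0
    for t in tri:
        p_tri = tri_counts[t] / len(tri)
        p_bi = bi_counts[t[:2]] / len(bi)
        total += math.log(p_tri / p_bi)
    return total / len(tri)
```

The mean of the logarithms equals the logarithm of the geometric mean, which is why no explicit product is formed.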
NumeralsPortion
neural_network_over_dssm_factors: 31
The share of different parts of speech in the text. The share of numerals (among all words whose part of speech was recognized)
|
ParticlesPortion
neural_network_over_dssm_factors: 32
The share of particles
|
AdjPronounsPortion
neural_network_over_dssm_factors: 33
The share of pronominal adjectives
|
AdvPronounsPortion
neural_network_over_dssm_factors: 34
The share of pronominal adverbs
|
VerbsPortion
neural_network_over_dssm_factors: 35
The share of verbs
|
FemAndMasNounsPortion
neural_network_over_dssm_factors: 36
The share of words that can be both masculine and feminine nouns, but not neuter, among all nouns (examples: 'hummingbird' is a noun of indeterminate gender that can be resolved either way; 'Alexander' is a case of homonymy).
|
LongestText
neural_network_over_dssm_factors: 37
The size of the largest text segment (from factor [18] PureText)
|
SegmentAuxAlphasInText
neural_network_over_dssm_factors: 44
Number of letters in the AUX segment
|
SegmentAuxSpacesInText
neural_network_over_dssm_factors: 45
The number of spaces in the AUX segment
|
SegmentContentCommasInText
neural_network_over_dssm_factors: 46
The number of commas in the Content segment
|
StaticTitleBM25Ex
neural_network_over_dssm_factors: 48
BM25 of the page title computed against the page text
|
TrashAdv
neural_network_over_dssm_factors: 49
Trash-advertising score of the page
|
Soft404
neural_network_over_dssm_factors: 55
The page is a '404' page (share of '404' tokens relative to the total number of tokens on the page)
|
PureText
neural_network_over_dssm_factors: 58
Long text without links.
|
RusLang
neural_network_over_dssm_factors: 66
The language of the document is Russian.
|
AuraDocLogShared
neural_network_over_dssm_factors: 77
Logarithm of the number of shingles on which this document is not unique
|
AuraDocLogAuthor
neural_network_over_dssm_factors: 78
Logarithm of the number of shingles on which this owner of the document is recognized as the author
|
AuraDocLogOrigin
neural_network_over_dssm_factors: 79
Logarithm of the number of shingles in the document added by the owner of the site as original texts in ((http://wiki.yandex-team.ru/jandekspoisk/jekosistema/marketingPr/webmasters/plan/vtorcontect of originality plugin)). It does not participate in the formula, it is needed to disconnect the takes
|
AuraDocMeanSharedWeight
neural_network_over_dssm_factors: 80
The average weight of the non-unique shingles of this document
|
AuraDocMeanFltAuthorSource
neural_network_over_dssm_factors: 81
The average filtered number of sources of authorship of the document. It does not participate in the formula, it is needed to disconnect the takes
|
HasUserReviews
neural_network_over_dssm_factors: 82
The document contains user review/comment
|
HasDownloadLinkOnFile
neural_network_over_dssm_factors: 83
The document has a direct link to the file
|
HasDownloadLinkOnFileHosting
neural_network_over_dssm_factors: 84
The document has a link to filehosting
|
SegmentWordPortionFromMainContent
neural_network_over_dssm_factors: 86
The share of the words of the document from the segments with Score> 2.
|
TextFeatures
neural_network_over_dssm_factors: 119
Text quality. Computed by a rather complex formula
|
TextLike
neural_network_over_dssm_factors: 120
Text quality (classifier Alekseev)
|
DocLen
neural_network_over_dssm_factors: 121
Document length in sentences
|
IsHTML
neural_network_over_dssm_factors: 123
Document type - HTML
|
EngLang
neural_network_over_dssm_factors: 136
Document language - English
|
CyrLang
neural_network_over_dssm_factors: 137
The document is in a language written in Cyrillic script
|
LanguagePopularity
neural_network_over_dssm_factors: 138
The popularity of the document's language. A number from 0 to 1. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaformula/tekushhiekomponenty/languaguaguagepopalarity))
|
DaterStatsYearNormLikelihood
neural_network_over_dssm_factors: 140
Likelihood function of the distribution of years in the document. Temporarily disabled
|
DaterStatsAverageSourceSegment
neural_network_over_dssm_factors: 141
The arithmetic mean position of dates in the document. Temporarily disabled
|
MaxD30Long
personalization: 0
Max cosine similarity between document and user history with clicks dwelltime > 30sec, by realtime user_history
|
MaxD60Long
personalization: 1
Max cosine similarity between document and user history with clicks dwelltime > 60sec, by realtime user_history
|
MaxD120Long
personalization: 2
Max cosine similarity between document and user history with clicks dwelltime > 120sec, by realtime user_history
|
MaxD180Long
personalization: 3
Max cosine similarity between document and user history with clicks dwelltime > 180sec, by realtime user_history
|
MaxD360Long
personalization: 4
Max cosine similarity between document and user history with clicks dwelltime > 360sec, by realtime user_history
|
MaxD30Short
personalization: 5
Max cosine similarity between document and user history with clicks dwelltime <= 30sec, by realtime user_history
|
MaxD60Short
personalization: 6
Max cosine similarity between document and user history with clicks dwelltime <= 60sec, by realtime user_history
|
MaxD120Short
personalization: 7
Max cosine similarity between document and user history with clicks dwelltime <= 120sec, by realtime user_history
|
MaxD180Short
personalization: 8
Max cosine similarity between document and user history with clicks dwelltime <= 180sec, by realtime user_history
|
MaxD360Short
personalization: 9
Max cosine similarity between document and user history with clicks dwelltime <= 360sec, by realtime user_history
|
TopavgS5D30Long
personalization: 10
Avg by top-5 maximum cosine similarity between document and user history with clicks dwelltime > 30sec, by realtime user_history
|
TopavgS5D60Long
personalization: 11
Avg by top-5 maximum cosine similarity between document and user history with clicks dwelltime > 60sec, by realtime user_history
|
TopavgS5D120Long
personalization: 12
Avg by top-5 maximum cosine similarity between document and user history with clicks dwelltime > 120sec, by realtime user_history
|
TopavgS5D180Long
personalization: 13
Avg by top-5 maximum cosine similarity between document and user history with clicks dwelltime > 180sec, by realtime user_history
|
TopavgS5D360Long
personalization: 14
Avg by top-5 maximum cosine similarity between document and user history with clicks dwelltime > 360sec, by realtime user_history
|
TopavgS10D30Long
personalization: 15
Avg by top-10 maximum cosine similarity between document and user history with clicks dwelltime > 30sec, by realtime user_history
|
TopavgS10D60Long
personalization: 16
Avg by top-10 maximum cosine similarity between document and user history with clicks dwelltime > 60sec, by realtime user_history
|
TopavgS10D120Long
personalization: 17
Avg by top-10 maximum cosine similarity between document and user history with clicks dwelltime > 120sec, by realtime user_history
|
TopavgS10D180Long
personalization: 18
Avg by top-10 maximum cosine similarity between document and user history with clicks dwelltime > 180sec, by realtime user_history
|
TopavgS10D360Long
personalization: 19
Avg by top-10 maximum cosine similarity between document and user history with clicks dwelltime > 360sec, by realtime user_history
|
TopavgS15D30Long
personalization: 20
Avg by top-15 maximum cosine similarity between document and user history with clicks dwelltime > 30sec, by realtime user_history
|
TopavgS15D60Long
personalization: 21
Avg by top-15 maximum cosine similarity between document and user history with clicks dwelltime > 60sec, by realtime user_history
|
TopavgS15D120Long
personalization: 22
Avg by top-15 maximum cosine similarity between document and user history with clicks dwelltime > 120sec, by realtime user_history
|
TopavgS15D180Long
personalization: 23
Avg by top-15 maximum cosine similarity between document and user history with clicks dwelltime > 180sec, by realtime user_history
|
TopavgS15D360Long
personalization: 24
Avg by top-15 maximum cosine similarity between document and user history with clicks dwelltime > 360sec, by realtime user_history
|
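The personalization rows above all follow one pattern: filter the user's click history by a dwell-time threshold, compute the cosine similarity of the document with each remaining item, then take either the maximum or the average of the top k values. A minimal sketch of that pattern follows; the history layout `(vector, dwell_seconds)` and the function names are assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical history item: (embedding of the clicked document, dwell time in seconds).
HistoryItem = Tuple[Sequence[float], float]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def history_similarities(doc: Sequence[float], history: List[HistoryItem],
                         dwell_sec: float, long_clicks: bool = True) -> List[float]:
    """Similarities to clicks with dwell > dwell_sec (Long) or <= dwell_sec (Short), descending."""
    sims = [cosine(doc, vec) for vec, dwell in history
            if (dwell > dwell_sec) == long_clicks]
    return sorted(sims, reverse=True)


def max_similarity(sims: List[float]) -> float:
    """Max* factors: the largest similarity, 0.0 for an empty history."""
    return sims[0] if sims else 0.0


def top_avg(sims: List[float], k: int) -> float:
    """TopavgS<k>* factors: mean of the k largest similarities."""
    top = sims[:k]
    return sum(top) / len(top) if top else 0.0
```

Under these assumptions, MaxD30Long would correspond to `max_similarity(history_similarities(doc, history, 30))` and TopavgS5D30Long to `top_avg(history_similarities(doc, history, 30), 5)`.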
DssmHaveShowsUrlTitleKeywordsPrediction
robot_selection_rank: 3
|
DssmHaveClicksUrlTitleKeywordsPrediction
robot_selection_rank: 4
|
DssmLogClicksUrlTitleKeywordsPrediction
robot_selection_rank: 5
|
WebTRp1
video_production: 2
Strict priority for TR (a text priority): all the query words occur somewhere in the document (and they satisfy the query's contextual restrictions, for example, both words must be in one sentence).
|
WebTRtitle
video_production: 3
Presence of the exact phrase (the query text) in the title (more precisely, in the first sentence of the document).
|
WebSoftAndOk
video_production: 7
The document passed SoftAnd under the restrictions of the syntactic wizard. Only for documents with text relevance. Always 1 for single-word queries.
|
WebPassageLegacyTR
video_production: 8
Text relevance (maxfreq, the frequency of the most frequent word, plays the role of the document length).
|
WebTRDocQuorum
video_production: 10
The weight of the query words that occur in the text.
|
DssmL2WebReformulationsDt
video_production: 99
LogDwellTime predicted by the web DSSM model trained on reformulations. Also used in Efir ranking.
|
DssmL2VideoReformulationsWin
video_production: 103
Win (a click longer than 60 seconds) predicted by the video DSSM model trained on reformulations.
|
QfufAllMaxFBodyWordCoverageExact
video_production: 126
Linguistic boosting factor. Expansion type: QFUF. Aggregation over all expansions. The maximum factor value. Computed over the document body. Coverage of the query words in exact form.
|
QfufTopMinWFSumWBodyBclmMixPlainKE5
video_production: 127
Linguistic boosting factor. Expansion type: QFUF. Aggregation over the top 10 expansions (by factor value). The minimum weighted factor value. Normalized by the total expansion weight. Computed over the document body. Word-weight aggregation algorithm BclmMixPlain: a linear mixture of the annotation BCLM weights and the weighted positionless word weights; the resulting values are then aggregated via BM15. Normalization coefficient 10^(-5).
|
Bclm2
video_production: 155
A factor for the proximity of the query and the document text. It differs from BCLM in that all word weights are treated as equal. Also used in Efir ranking.
|
DBMNumbers
video_production: 157
DBM (BM25 with machine-like words), computed over numbers only.
|
BocmFull
video_production: 178
Plain BOCM over the merged links.
|
FirstHitSentenceBocmFull
video_production: 179
BOCM over the merged links, computed only on the first sentences with hits; all forms of occurrence are treated as equivalent.
|
BestFirstHitSentenceTocm
video_production: 180
The best BOCM among all links of the Title type (an analogue of TOCM), computed as follows: only sentences with hits are considered, and all forms of occurrence are treated as equivalent.
|
DbmVideoNumbers
video_production: 183
The new DBM computed only over the merged links (it differs from DBMNumbers [157] only in constants and completely supersedes it).
|
TitleTrigramsInQuery
video_production: 219
Coverage of the query's trigrams by the trigrams of the title. Also used in Efir ranking.
|
UrlTrigramsInQuery
video_production: 220
Coverage of the query's trigrams by the trigrams of the URL.
|
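One plausible reading of the two trigram-coverage factors above is the share of the query's trigrams that also occur in the title or URL. The sketch below assumes character trigrams over lowercased text, which the source does not confirm; the function names are hypothetical.

```python
from typing import Set


def char_trigrams(text: str) -> Set[str]:
    """Set of character trigrams of the lowercased text (assumed granularity)."""
    s = text.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}


def trigram_coverage(query: str, field: str) -> float:
    """Share of the query's trigrams that also occur in the field (title or URL)."""
    q = char_trigrams(query)
    if not q:
        return 0.0
    return len(q & char_trigrams(field)) / len(q)
```

For the query `cats` and the title `cat`, the query has two trigrams (`cat`, `ats`) and the title covers one of them, so the coverage is 0.5.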
QfufAllSumW2FSumWTitlePerWordCMMaxPredictionMin
video_production: 248
Linguistic boosting factor. Expansion type: QFUF. Aggregation over all expansions. The sum over expansions of the squared expansion weight multiplied by the factor value. Normalized by the total expansion weight. Computed over the document title. PerwordCmmaxMatchMin algorithm: the minimum over words of the maximum CMMaxMatch weight across annotations.
|
QfufTopSumWFSumWTitleWordCoverageExact
video_production: 253
Linguistic boosting factor. Expansion type: QFUF. Aggregation over the top 10 expansions (by factor value). The weighted sum of factor values. Normalized by the total expansion weight. Computed over the document title. Coverage of the query words in exact form.
|
QfufAllSumFCountBodyAllWcmMatch80AvgValue
video_production: 280
Linguistic boosting factor. Expansion type: QFUF. Aggregation over all expansions. The sum of factor values. The number of expansions. Computed over the document body. The sum of word weights, weighted by the annotation weight and normalized by the sum of word weights. Only annotations in which the matched word weights sum to more than 80% are counted.
|
QfufTopSumWFSumWBodyQueryPrefixMatchOriginalWordValue
video_production: 323
Linguistic boosting factor. Expansion type: QFUF. Aggregation over the top 10 expansions (by factor value). The weighted sum of factor values. Normalized by the total expansion weight. Computed over the document body. The maximum weight of an annotation whose prefix contains the query words in the same order (up to word form).
|
MetaWebTRDocQuorum
video_production: 374
The average value of the WebTRDocQuorum factor at the middle search tier.
|
MetaBocmFull
video_production: 387
The average value of the BocmFull factor at the middle search tier.
|
MetaBestFirstHitSentenceTocm
video_production: 388
The average value of the BestFirstHitSentenceTocm factor at the middle search tier.
|
MetaAvgBocmFull
video_production: 459
The average value of the BocmFull factor in PRS
|
MetaAvgBestFirstHitSentenceTocm
video_production: 460
The average value of the BestFirstHitSentenceTocm factor in PRS
|
MetaRmsBocmFull
video_production: 471
The root-mean-square deviation of the BocmFull factor in PRS
|
MetaRmsTitleTrigramsInQuery
video_production: 472
The root-mean-square deviation of the TitleTrigramsInQuery factor in PRS
|
MetaVarianceTitleCovering
video_production: 485
Pearson's coefficient of variation for the TitleCovering factor in PRS
|
MetaVarianceTitleTrigramsInQuery
video_production: 487
Pearson's coefficient of variation for the TitleTrigramsInQuery factor in PRS
|
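The Meta* rows above aggregate one base factor over a set of documents (PRS) as an average, a root-mean-square deviation, or a coefficient of variation. A small sketch of these three statistics follows, assuming the population form of the deviation and a plain list of per-document factor values; neither detail is stated in the source.

```python
from statistics import fmean, pstdev
from typing import Dict, Sequence


def meta_stats(values: Sequence[float]) -> Dict[str, float]:
    """Average, RMS (population standard) deviation and coefficient of variation."""
    if not values:
        return {"avg": 0.0, "rms": 0.0, "variation": 0.0}
    avg = fmean(values)
    rms = pstdev(values)
    # Coefficient of variation: deviation relative to the mean (0.0 if the mean is 0).
    variation = rms / avg if avg else 0.0
    return {"avg": avg, "rms": rms, "variation": variation}
```

Under these assumptions, MetaAvgBocmFull, MetaRmsBocmFull and a MetaVariance* factor would correspond to the `avg`, `rms` and `variation` entries computed over the same document set.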
MetaResidWebTRDocQuorum
video_production: 496
Resid for the WebTRDocQuorum factor in PRS
|
MetaResidTitleCovering
video_production: 500
Resid for the TitleCovering factor in PRS
|
MetaFractTitleTrigramsInQuery
video_production: 503
Fract for the TitleTrigramsInQuery factor in PRS
|
DssmL3WebLogDwellTime
video_production: 559
LogDwellTime predicted by the web DSSM model. Also used in Efir ranking.
|
DssmL2WebLogDwellTime
video_production: 585
LogDwellTime predicted by the web DSSM model.
|
TitleQuerySimilarityByClicks
video_production: 593
Similarity between the T2q vectors of the title and the query, a la Klakhman, trained on clicks
|
DssmL3VideoDeepClickPlayerDepth
video_production: 596
DSSM with a PlayerDepth target, trained on the video deep-click pool. Also used in Efir ranking.
|
DssmL2VideoDcPlayerDepth
video_production: 599
DSSM with a PlayerDepth target on video deep clicks, at the L2 stage.
|
OriginalRequestBodyAvgPerTrigramAvgValueAny
video_production: 729
The factor for the original request. Computed over the document body. Algorithm: AvgPerTrigramAvgValueAny.
|
OriginalRequestBodyBclmPlaneProximity1Bm15W0Size1K001
video_production: 730
The factor for the original request. Computed over the document body. Algorithm BclmPlaneProximity1Bm15W0Size1: uses BCLM with free weighting if there are several words; if there is a single word, the sum of hits is used as the match type. Normalization coefficient 0.01.
|
OriginalRequestBodyBocm15K001
video_production: 731
The factor for the original request. Computed over the document body. Word-weight aggregation algorithm: Bocm15. Normalization coefficient 0.01.
|
OriginalRequestBodyWordCoverageExact
video_production: 732
The factor for the original request. Computed over the document body. Coverage of the query words in exact form.
|
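The WordCoverageExact factors describe how much of the query is covered by exact-form matches in a field. A minimal, unweighted sketch follows; the real factors presumably weight words, and the whitespace tokenization and function name here are assumptions.

```python
def word_coverage_exact(query: str, text: str) -> float:
    """Share of query words that occur in the text in exactly the same form."""
    query_words = query.lower().split()
    if not query_words:
        return 0.0
    text_words = set(text.lower().split())
    covered = sum(1 for w in query_words if w in text_words)
    return covered / len(query_words)
```

For the query `red car` and the text `a red bicycle`, only `red` is matched, so the coverage is 0.5; a form-insensitive variant would additionally need a lemmatizer, which is outside this sketch.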
OriginalRequestTitleWordCoverageAny
video_production: 733
The factor for the original request. Computed over the document title. Coverage of the query words (all hit types).
|
LeftIsPorno
web_itditp: 0
Document from porn kitski
|
IsPorno
web_itditp: 1
Document from porn kitski
|
LeftIsComm
web_itditp: 4
A document from a commercial clay. Not used (deprecated)
|
IsComm
web_itditp: 5
A document from a commercial clay. Not used (deprecated)
|
LeftIsFake
web_itditp: 6
Fake document
|
IsFake
web_itditp: 7
Fake document
|
LeftIsSEO
web_itditp: 8
The page title contains commercial vocabulary. Not used (deprecated)
|
IsSEO
web_itditp: 9
The page title contains commercial vocabulary. Not used (deprecated)
|
LeftIsEShop
web_itditp: 10
Commercial page (Classifier Savina)
|
IsEShop
web_itditp: 11
Commercial page (Classifier Savina)
|
LeftHasPayments
web_itditp: 16
The page contains text about 'Payment SMS'.
|
HasPayments
web_itditp: 17
The page contains text about 'Payment SMS'.
|
LeftEshopValue
web_itditp: 22
Degree to which the page resembles an online shop
|
EshopValue
web_itditp: 23
Degree to which the page resembles an online shop
|
LeftPornoValue
web_itditp: 24
Degree to which the page is pornographic
|
PornoValue
web_itditp: 25
Degree to which the page is pornographic
|
LeftIsPornoAdvert
web_itditp: 26
The page contains porn advertising
|
IsPornoAdvert
web_itditp: 27
The page contains porn advertising
|
LeftPoetry
web_itditp: 28
Degree to which the document is poetry
|
Poetry
web_itditp: 29
Degree to which the document is poetry
|
LeftPoetryQuad
web_itditp: 30
Maximum poetry score of a quatrain
|
PoetryQuad
web_itditp: 31
Maximum poetry score of a quatrain
|
LeftSynS1
web_itditp: 32
Shows how unnatural the text is from the point of view of the Russian language. An estimate of how likely the document text was generated by a synonymizer or other automatic means. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
SynS1
web_itditp: 33
Shows how unnatural the text is from the point of view of the Russian language. An estimate of how likely the document text was generated by a synonymizer or other automatic means. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
LeftSynFLremap1
web_itditp: 34
Shows how unnatural the text is from the point of view of the Russian language. An estimate of how likely the document text was generated by a synonymizer or other automatic means. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
SynFLremap1
web_itditp: 35
Shows how unnatural the text is from the point of view of the Russian language. An estimate of how likely the document text was generated by a synonymizer or other automatic means. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
LeftSynFLremap2
web_itditp: 36
Shows how unnatural the text is from the point of view of the Russian language. An estimate of how likely the document text was generated by a synonymizer or other automatic means. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
SynFLremap2
web_itditp: 37
Shows how unnatural the text is from the point of view of the Russian language. An estimate of how likely the document text was generated by a synonymizer or other automatic means. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAfermula/tekushhiekomponenty/antispam?v=1il#h58953-2 more))
|
LeftSynPercentBadWordPairs
web_itditp: 40
An indicator of how unnatural the text is from the point of view of the Russian language. The number of bad word pairs in the text, mapped into the interval [0, 1] by the formula Z/(Z+10)
|
SynPercentBadWordPairs
web_itditp: 41
An indicator of how unnatural the text is from the point of view of the Russian language. The number of bad word pairs in the text, mapped into the interval [0, 1] by the formula Z/(Z+10)
|
LeftSynNumBadWordPairs
web_itditp: 42
The share of bad pairs among all pairs found in the table: Z/(X+1), where Z is the number of bad pairs in the text and X is the number of ((http://wiki.yandex-team.ru/evgenijjjgrechnikov/testSynonimizers 2000-navigable)) pairs
|
SynNumBadWordPairs
web_itditp: 43
The share of bad pairs among all pairs found in the table: Z/(X+1), where Z is the number of bad pairs in the text and X is the number of ((http://wiki.yandex-team.ru/evgenijjjgrechnikov/testSynonimizers 2000-navigable)) pairs
|
LeftNumLatinLetters
web_itditp: 44
The number of Latin letters in the text (not counting markup), mapped into [0, 1] by the formula n/(n+100)
|
NumLatinLetters
web_itditp: 45
The number of Latin letters in the text (not counting markup), mapped into [0, 1] by the formula n/(n+100)
|
LeftRusWordsInText
web_itditp: 48
The number of words in the text (a word is whatever the lemmatizer extracted), mapped into [0, 1] by the formula x/(x+a)
|
RusWordsInText
web_itditp: 49
The number of words in the text (a word is whatever the lemmatizer extracted), mapped into [0, 1] by the formula x/(x+a)
|
LeftRusWordsInTitle
web_itditp: 50
The number of words of the Russian language in the title
|
RusWordsInTitle
web_itditp: 51
The number of words of the Russian language in the title
|
LeftMeanWordLength
web_itditp: 52
The average length of the word
|
MeanWordLength
web_itditp: 53
The average length of the word
|
LeftPercentWordsInLinks
web_itditp: 54
The percentage of the number of words inside the tag <a> .. </a> from the number of all words
|
PercentWordsInLinks
web_itditp: 55
The percentage of the number of words inside the tag <a> .. </a> from the number of all words
|
LeftPercentVisibleContent
web_itditp: 56
The percentage of the number of words outside the tags (outside the brackets <>) from the number of all words
|
PercentVisibleContent
web_itditp: 57
The percentage of the number of words outside the tags (outside the brackets <>) from the number of all words
|
LeftPercentFreqWords
web_itditp: 58
The percentage of the number of words, which are 200 the most frequent words of the language, from the number of all words of the text
|
PercentFreqWords
web_itditp: 59
The percentage of the number of words, which are 200 the most frequent words of the language, from the number of all words of the text
|
LeftPercentUsedFreqWords
web_itditp: 60
The number of the 500 most popular words of the language that are used in the text, divided by 500
|
PercentUsedFreqWords
web_itditp: 61
The number of the 500 most popular words of the language that are used in the text, divided by 500
|
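The two frequent-word factors above measure different things: one is the share of word occurrences that fall in a frequency list, the other is how much of that list the text uses. A minimal sketch, assuming the frequency list is supplied as a set (the function names and the toy list are hypothetical; the real factors use the top 200 and top 500 words of the language):

```python
def percent_freq_words(words, frequent):
    """Percentage of word occurrences in the text that belong to `frequent`."""
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in frequent)
    return 100.0 * hits / len(words)


def used_freq_words_share(words, frequent):
    """Number of distinct words from `frequent` used in the text,
    divided by the size of `frequent`."""
    if not frequent:
        return 0.0
    return len(set(words) & set(frequent)) / len(frequent)
```

For example, with the words `["the", "cat", "and", "the", "dog"]` and the toy list `{"the", "and", "of", "a"}`, three of five occurrences are frequent words, and two of the four list entries are used.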
LeftTrigramsProb
web_itditp: 62
Logarithm of the geometric mean of the trigram probabilities in the text (the probability of a trigram is the number of its occurrences in the text divided by the number of all trigrams). Mapped into [0, 1] according to the formula -x (x+a)
|
TrigramsProb
web_itditp: 63
Logarithm of the geometric mean of the trigram probabilities in the text (the probability of a trigram is the number of its occurrences in the text divided by the number of all trigrams). Mapped into [0, 1] according to the formula -x (x+a)
|
LeftTrigramsCondProb
web_itditp: 64
Logarithm of the geometric mean of the conditional probabilities of trigrams. The conditional probability of a trigram is its probability divided by the probability of the bigram formed by its first two words
|
TrigramsCondProb
web_itditp: 65
Logarithm of the geometric mean of the conditional probabilities of trigrams. The conditional probability of a trigram is its probability divided by the probability of the bigram formed by its first two words
|
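The trigram statistics above can be illustrated with a short sketch. It relies on the identity that the logarithm of a geometric mean equals the arithmetic mean of the logarithms. Several details are assumptions, not taken from the source: the mean is taken over all trigram occurrences in the text, the bigram probability is the bigram count divided by the number of all bigrams, and the final mapping into [0, 1] is left out. All function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(words, n):
    """All consecutive n-word tuples of the word list."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def log_geo_mean_trigram_prob(words):
    """Mean of log P(trigram), where P = occurrences / number of all trigrams."""
    tri = ngrams(words, 3)
    if not tri:
        return 0.0
    counts = Counter(tri)
    total = len(tri)
    return sum(math.log(counts[t] / total) for t in tri) / total


def log_geo_mean_trigram_cond_prob(words):
    """Mean of log(P(trigram) / P(bigram of its first two words))."""
    tri = ngrams(words, 3)
    bi = ngrams(words, 2)
    if not tri:
        return 0.0
    tri_counts, bi_counts = Counter(tri), Counter(bi)
    logs = [
        math.log((tri_counts[t] / len(tri)) / (bi_counts[t[:2]] / len(bi)))
        for t in tri
    ]
    return sum(logs) / len(logs)
```

Highly repetitive text produces values closer to zero, because a few trigrams carry most of the probability mass.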
LeftNumeralsPortion
web_itditp: 66
Shares of different parts of speech in the text. The share of numerals (among all words whose part of speech could be recognized)
|
NumeralsPortion
web_itditp: 67
Shares of different parts of speech in the text. The share of numerals (among all words whose part of speech could be recognized)
|
LeftParticlesPortion
web_itditp: 68
The share of particles
|
ParticlesPortion
web_itditp: 69
The share of particles
|
LeftAdjPronounsPortion
web_itditp: 70
The share of pronominal adjectives
|
AdjPronounsPortion
web_itditp: 71
The share of pronominal adjectives
|
LeftAdvPronounsPortion
web_itditp: 72
The share of pronominal nouns
|
AdvPronounsPortion
web_itditp: 73
The share of pronominal nouns
|
LeftVerbsPortion
web_itditp: 74
The share of verbs
|
VerbsPortion
web_itditp: 75
The share of verbs
|
LeftFemAndMasNounsPortion
web_itditp: 76
The share of words that can be either masculine or feminine nouns, but not neuter, among all nouns (examples: 'Hummingbird' has an indeterminate gender that can be resolved either way; 'Alexander' is a case of homonymy).
|
FemAndMasNounsPortion
web_itditp: 77
The share of words that can be either masculine or feminine nouns, but not neuter, among all nouns (examples: 'Hummingbird' has an indeterminate gender that can be resolved either way; 'Alexander' is a case of homonymy).
|
LeftLongestText
web_itditp: 80
The size of the largest text segment (from factor [18] PureText)
|
LongestText
web_itditp: 81
The size of the largest text segment (from factor [18] PureText)
|
LeftSegmentAuxAlphasInText
web_itditp: 94
Number of letters in the AUX segment
|
SegmentAuxAlphasInText
web_itditp: 95
Number of letters in the AUX segment
|
LeftSegmentAuxSpacesInText
web_itditp: 96
The number of spaces in the AUX segment
|
SegmentAuxSpacesInText
web_itditp: 97
The number of spaces in the AUX segment
|
LeftSegmentContentCommasInText
web_itditp: 98
The number of commas in the Content segment
|
SegmentContentCommasInText
web_itditp: 99
The number of commas in the Content segment
|
LeftIsShop
web_itditp: 100
The page is a shop. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAformula/tekushhiekomponenty/opisanijafaktorov#SSHOP Description)). Not used (deprecated)
|
IsShop
web_itditp: 101
The page is a shop. ((http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayAformula/tekushhiekomponenty/opisanijafaktorov#SSHOP Description)). Not used (deprecated)
|
LeftStaticTitleComm
web_itditp: 104
The degree of commerciality of the page title. Not used (deprecated)
|
StaticTitleComm
web_itditp: 105
The degree of commerciality of the page title. Not used (deprecated)
|
LeftStaticTitleBM25Ex
web_itditp: 106
BM25 of the page title computed against the page text
|
StaticTitleBM25Ex
web_itditp: 107
BM25 of the page title computed against the page text
|
LeftTrashAdv
web_itditp: 108
The degree to which the page is cluttered with advertising
|
TrashAdv
web_itditp: 109
The degree to which the page is cluttered with advertising
|
LeftLong
web_itditp: 281
Long document (the longer the document, the greater the value of the factor).
|
LeftPureText
web_itditp: 282
Long text without links.
|
LeftRusLang
web_itditp: 284
The language of the document is Russian.
|
LeftTextFeatures
web_itditp: 291
Text quality. Computed by a rather complex formula
|
LeftTextLike
web_itditp: 292
Text quality (Alekseev's classifier)
|
LeftDocLen
web_itditp: 293
Document length in sentences
|
LeftIsHTML
web_itditp: 295
Document type - HTML
|
LeftAdultness
web_itditp: 302
equals 2 * NastyContent
|
LeftEngLang
web_itditp: 305
Document language - English
|
LeftCyrLang
web_itditp: 306
The document is in a Cyrillic-script language
|
LeftAuraDocLogShared
web_itditp: 308
Logarithm of the number of shingles on which this document is not unique
|
LeftAuraDocLogAuthor
web_itditp: 309
Logarithm of the number of shingles on which this owner of the document is recognized as the author
|
LeftAuraDocMeanSharedWeight
web_itditp: 310
The average weight of the non-unique shingles of this document
|
LeftSoft404
web_itditp: 311
The page is a '404' page (share of '404' tokens relative to the total number of tokens on the page)
|
LeftAuraDocLogOrigin
web_itditp: 312
Logarithm of the number of shingles in the document that the site owner added as original texts in ((http://wiki.yandex-team.ru/jandekspoisk/jekosistema/marketingPr/webmasters/plan/vtorcontect of originality plugin)). It does not participate in the formula; it is needed to switch off duplicates
|
LeftAuraDocMeanFltAuthorSource
web_itditp: 313
The average filtered number of authorship sources for the document. It does not participate in the formula; it is needed to switch off duplicates
|
LeftLanguagePopularity
web_itditp: 314
The popularity of the document's language. A number from 0 to 1. (http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaformula/tekushhiekomponenty/languaguaguagepopalarity)
|
LeftHasDownloadLinkOnFile
web_itditp: 333
The document has a direct link to the file
|
LeftHasDownloadLinkOnFileHosting
web_itditp: 334
The document has a link to filehosting
|
LeftHasUserReviews
web_itditp: 335
The document contains user review/comment
|
LeftDocCreateMonth
web_itditp: 336
Document creation time, accurate to the month: 1.0 is the current month, 0 is 10 years ago or older. Temporarily disabled
|
LeftDocUpdateMonth
web_itditp: 337
Document update time, accurate to the month: 1.0 is the current month, 0 is 10 years ago or older. Temporarily disabled
|
LeftDaterStatsYearNormLikelihood
web_itditp: 338
The likelihood function of the distribution of years in the document. Temporarily disabled
|
LeftDaterStatsAverageSourceSegment
web_itditp: 339
The arithmetic mean position of dates in the document. Temporarily disabled
|
LeftSegmentWordPortionFromMainContent
web_itditp: 341
The share of the document's words that come from segments with Score > 2.
|
Long
web_itditp: 376
Long document (the longer the document, the greater the value of the factor).
|
PureText
web_itditp: 377
Long text without links.
|
RusLang
web_itditp: 379
The language of the document is Russian.
|
TextFeatures
web_itditp: 386
Text quality. Computed by a rather complex formula
|
TextLike
web_itditp: 387
Text quality (Alekseev's classifier)
|
DocLen
web_itditp: 388
Document length in sentences
|
IsHTML
web_itditp: 390
Document type - HTML
|
Adultness
web_itditp: 397
equals 2 * NastyContent
|
EngLang
web_itditp: 400
Document language - English
|
CyrLang
web_itditp: 401
The document is in a Cyrillic-script language
|
AuraDocLogShared
web_itditp: 403
Logarithm of the number of shingles on which this document is not unique
|
AuraDocLogAuthor
web_itditp: 404
Logarithm of the number of shingles on which this owner of the document is recognized as the author
|
AuraDocMeanSharedWeight
web_itditp: 405
The average weight of the non-unique shingles of this document
|
Soft404
web_itditp: 406
The page is a '404' page (share of '404' tokens relative to the total number of tokens on the page)
|
AuraDocLogOrigin
web_itditp: 407
Logarithm of the number of shingles in the document that the site owner added as original texts in ((http://wiki.yandex-team.ru/jandekspoisk/jekosistema/marketingPr/webmasters/plan/vtorcontect of originality plugin)). It does not participate in the formula; it is needed to switch off duplicates
|
AuraDocMeanFltAuthorSource
web_itditp: 408
The average filtered number of authorship sources for the document. It does not participate in the formula; it is needed to switch off duplicates
|
LanguagePopularity
web_itditp: 409
The popularity of the document's language. A number from 0 to 1. (http://wiki.yandex-team.ru/jandekspoisk/kachestvopoiska/obshayaformula/tekushhiekomponenty/languaguaguagepopalarity)
|
HasDownloadLinkOnFile
web_itditp: 428
The document has a direct link to the file
|
HasDownloadLinkOnFileHosting
web_itditp: 429
The document has a link to filehosting
|
HasUserReviews
web_itditp: 430
The document contains user review/comment
|
DocCreateMonth
web_itditp: 431
Document creation time, accurate to the month: 1.0 is the current month, 0 is 10 years ago or older. Temporarily disabled
|
DocUpdateMonth
web_itditp: 432
Document update time, accurate to the month: 1.0 is the current month, 0 is 10 years ago or older. Temporarily disabled
|
DaterStatsYearNormLikelihood
web_itditp: 433
The likelihood function of the distribution of years in the document. Temporarily disabled
|
DaterStatsAverageSourceSegment
web_itditp: 434
The arithmetic mean position of dates in the document. Temporarily disabled
|
SegmentWordPortionFromMainContent
web_itditp: 436
The share of the document's words that come from segments with Score > 2.
|
DssmLogDwellTimeBigramsDot
web_itditp: 606
|
DssmAggregatedAnnRegDot
web_itditp: 607
|
DssmMainContentKeywordsDot
web_itditp: 608
|
DssmBoostingXfWtd11Dot
web_itditp: 619
|
DssmBoostingXfWtd12Dot
web_itditp: 620
|
DssmBoostingXfWtd13Dot
web_itditp: 621
|
DssmBoostingXfWtd14Dot
web_itditp: 622
|
DssmBoostingXfWtd15Dot
web_itditp: 623
|
DssmBoostingXfWtd22Dot
web_itditp: 624
|
DssmBoostingXfWtd23Dot
web_itditp: 625
|
DssmBoostingXfWtd24Dot
web_itditp: 626
|
DssmBoostingXfWtd25Dot
web_itditp: 627
|
DssmBoostingXfWtd33Dot
web_itditp: 628
|
DssmBoostingXfWtd34Dot
web_itditp: 629
|
DssmBoostingXfWtd35Dot
web_itditp: 630
|
DssmBoostingXfWtd44Dot
web_itditp: 631
|
DssmBoostingXfWtd45Dot
web_itditp: 632
|
DssmBoostingXfWtd55Dot
web_itditp: 633
|
DssmBoostingXfOneDot
web_itditp: 634
|
DssmBoostingXfOneSeDot
web_itditp: 635
|
DssmBoostingCtrDot
web_itditp: 636
|
DssmBoostingXfOneSeAmSsHardDot
web_itditp: 657
|
DssmLogDwellTimeBigramsL3Dot
web_itditp: 663
Dot product of DocDssmLogDwellTimeBigramsL3Embedding for the left and right URLs
|
DssmLogDTBigramsAMHards
web_itditp: 748
|
DssmPantherTermsDot
web_itditp: 752
|
RecDssmSpyTitleDomainCompressedEmb12Dot
web_itditp: 754
RecDssmSpyTitleDomainEmb12Dot
|
RecCFSharpDomainDot
web_itditp: 755
RecCFSharpDomainDot
|
WatchLogUserHistoryHostClusterDssmDot
web_itditp: 756
|
SpyLogUserHistoryHostClusterDssmDot
web_itditp: 757
|
L1DssmBigrams
web_l1: 9
DSSM model trained on clicks. Takes bigrams into account. Embeddings for documents are computed offline.
|
L1DssmMainContentKeywords
web_l1: 11
Query-MainContentKeywords similarity, target: logDwellTime
|
L1DssmBigramsMin
web_l1: 97
|
L1DssmBigramsMetaRatioMin
web_l1: 98
|
L1DssmBigramsMax
web_l1: 99
|
L1DssmBigramsMetaRatioMax
web_l1: 100
|
L1DssmBigramsAvg
web_l1: 101
|
L1DssmBigramsQ90
web_l1: 102
|
L1DssmBigramsMetaRatioQ90
web_l1: 103
|
L1DssmBigramsQ95
web_l1: 104
|
L1DssmBigramsMetaRatioQ95
web_l1: 105
|
L1DssmBigramsQ99
web_l1: 106
|
L1DssmBigramsMetaRatioQ99
web_l1: 107
|
L1DssmPantherTerms
web_l1: 120
|
TitleBclmMixPlainK000001
web_l2: 2
Linguistic boosting factor.
|
TitleBocm15K001
web_l2: 3
Linguistic boosting factor.
|
TitleCMMatchTop5AvgMatchValue
web_l2: 4
Linguistic boosting factor.
|
TitleWordCoverageForm
web_l2: 5
Linguistic boosting factor.
|
TitleAttenV1Bm15K05
web_l2: 6
Linguistic boosting factor.
|
BodyBclmMixPlainK000001
web_l2: 7
Linguistic boosting factor.
|
BodyCosineMatchMaxPrediction
web_l2: 8
Linguistic boosting factor.
|
BodyAllWcmWeightedPrediction
web_l2: 9
Linguistic boosting factor.
|
BodyBocm15K0_01
web_l2: 10
Linguistic boosting factor.
|
BodyQueryPartMatchSumValueAny
web_l2: 11
Linguistic boosting factor.
|
BodyWordCoverageForm
web_l2: 12
Linguistic boosting factor.
|
BodyWordCoverageExact
web_l2: 13
Linguistic boosting factor.
|
BodyBm15MaxAnnotationK0_01
web_l2: 14
Linguistic boosting factor.
|
QfufAllSumW2FSumWFullTextPerWordCMMaxPredictionMin
web_l2: 15
Linguistic boosting factor. Expansion type: QFUF. Aggregation over all expansions: the sum of the squared expansion weight multiplied by the factor value, normalized by the total weight of the expansions. Composite TR-like stream that includes the URL, title, and body of the document. PerwordCmmaxMatchMin algorithm: the minimum over the per-annotation maxima of the CMMAXMATCH weight.
|
QfufAllMaxFFullTextBocm15K001
web_l2: 16
Linguistic boosting factor. Expansion type: QFUF. Aggregation over all expansions: the greatest value of the factor. Composite TR-like stream that includes the URL, title, and body of the document. Word-weight aggregation algorithm: Bocm15. Normalization coefficient 0.01.
|
QfufAllMaxFFullTextBclmWeightedProximity1Bm15Size1K0001
web_l2: 17
Linguistic boosting factor. Expansion type: QFUF. Aggregation over all expansions: the greatest value of the factor. Composite TR-like stream that includes the URL, title, and body of the document. Word-weight aggregation algorithm: BclmWeightedProximity1Bm15Size1. Normalization coefficient 0.001.
|
QfufAllMaxFFullTextTRBclmLite
web_l2: 18
Linguistic boosting factor. Expansion type: QFUF. Aggregation over all expansions: the greatest value of the factor. Composite TR-like stream that includes the URL, title, and body of the document. Word-weight aggregation algorithm: BclmLite, an adaptation of the corresponding TR factor.
|
QfufAllMaxWFSumWFullTextAllWcmMatch80AvgValue
web_l2: 19
Linguistic boosting factor. Expansion type: QFUF. Aggregation over all expansions: the greatest weighted value of the factor, normalized by the total weight of the expansions. Composite TR-like stream that includes the URL, title, and body of the document. The sum of word weights, weighted by the annotation weight, normalized by the sum of word weights. Only annotations in which the sum of word weights exceeds 80% are counted.
|
QfufTopSumWFSumWFullTextBocm15K001
web_l2: 20
Linguistic boosting factor. Expansion type: QFUF. Aggregation over the top 10 expansions (by factor value): the weighted sum of factor values, normalized by the total weight of the expansions. Composite TR-like stream that includes the URL, title, and body of the document. Word-weight aggregation algorithm: Bocm15. Normalization coefficient 0.01.
|
QfufAllSumW2FSumWFullTextTRBclmLite
web_l2: 21
Linguistic boosting factor. Expansion type: QFUF. Aggregation over all expansions: the sum of the squared expansion weight multiplied by the factor value, normalized by the total weight of the expansions. Composite TR-like stream that includes the URL, title, and body of the document. Word-weight aggregation algorithm: BclmLite, an adaptation of the corresponding TR factor.
|
QfufAllSumWFSumWFullTextPerWordCMMaxPredictionMin
web_l2: 22
Linguistic boosting factor. Expansion type: QFUF. Aggregation over all expansions: the weighted sum of factor values, normalized by the total weight of the expansions. Composite TR-like stream that includes the URL, title, and body of the document. PerwordCmmaxMatchMin algorithm: the minimum over the per-annotation maxima of the CMMAXMATCH weight.
|
QueryToTextTopMinFBodyWordCoverageExact
web_l2: 23
Linguistic boosting factor.
|
QueryToTextAllMaxWFSumWBodyAllWcmWeightedValue
web_l2: 24
Linguistic boosting factor.
|
OriginalRequestFullTextTRTxtBm25ExactK1
web_l2: 44
|
OriginalRequestFullTextTRTxtBm25K1
web_l2: 45
|
OriginalRequestFullTextTRTxtBm25SynonymW1K1
web_l2: 46
|
OriginalRequestFullTextTRTextForms
web_l2: 47
|
OriginalRequestFullTextTRTextWeightedForms
web_l2: 48
|
OriginalRequestFullTextTRNumWordsSynonym
web_l2: 49
|
OriginalRequestFullTextOldTRAttenTRTxtBm25SynonymK1
web_l2: 50
|
OriginalRequestFullTextTxtHeadTRTxtBm25SynonymK0
web_l2: 51
|
OriginalRequestFullTextTxtHeadTRTxtBm25ExactK0
web_l2: 52
|
OriginalRequestFullTextTxtHeadTRTxtBm25K0
web_l2: 53
|
OriginalRequestFullTextTxtHiRelTRTxtBm25SynonymK0
web_l2: 54
|
OriginalRequestFullTextTxtHiRelTRTxtBm25ExactK0
web_l2: 55
|
OriginalRequestFullTextTxtHiRelTRTxtBm25K0
web_l2: 56
|
OriginalRequestFullTextTRBclmLite
web_l2: 57
|
OriginalRequestFullTextTRTxtPair
web_l2: 58
|
OriginalRequestFullTextTRTxtPairExact
web_l2: 59
|
OriginalRequestFullTextTRTxtPairW1
web_l2: 60
|
OriginalRequestFullTextTRTxtBreakSynonym
web_l2: 61
|
PantherTermsDssmModelQfufLbTop5MinWF
web_l2: 62
DSSM model on Panther terms. Expansion embeddings are used for L2. The top 5 QFUF expansions by cosine proximity to the document embedding are selected. The factor is the minimum of the expansion weight multiplied by the proximity.
|
PantherTermsDssmModelQfufLbTop5SumWFNormedSumW
web_l2: 63
DSSM model on Panther terms. Expansion embeddings are used for L2. The top 5 QFUF expansions by cosine proximity to the document embedding are selected. Then the weighted sum of proximities is computed, with the expansion weight used as the weight, normalized by the sum of the weights of the selected expansions.
|
PantherTermsDssmModelQfufLbSumFNormedCount
web_l2: 64
DSSM model on Panther terms. Expansion embeddings are used for L2. The factor is the average proximity to the document embedding.
|
PantherTermsDssmModelQfufLbMinWF
web_l2: 65
DSSM model on Panther terms. Expansion embeddings are used for L2. The factor is the minimum of the expansion weight multiplied by the proximity.
|
FioFromOriginalRequestBodyChain0Wcm
web_l2: 66
Factor for the full name (FIO) from the original request, computed over the document body. Algorithm: Chain0Wcm
|
FioFromOriginalRequestBodyMinWindowSize
web_l2: 67
Factor for the full name (FIO) from the original request, computed over the document body. The minimum window size that includes all the words of the request, normalized by the number of words in the request.
|
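The minimum-window computation described above can be sketched with a standard sliding-window scan. This is an illustration only: the function names are hypothetical, words are compared as exact strings (the real factor presumably works on matched hits), and dividing the window size by the number of request words is one assumed reading of the normalization.

```python
from collections import Counter


def min_window_size(doc_words, query_words):
    """Length in words of the shortest span of doc_words containing every
    distinct query word, or None if some query word never occurs."""
    need = set(query_words)
    if not need:
        return None
    have = Counter()
    covered = 0  # number of distinct query words inside the current window
    best = None
    left = 0
    for right, word in enumerate(doc_words):
        if word in need:
            have[word] += 1
            if have[word] == 1:
                covered += 1
        # Shrink from the left while the window still covers the whole query.
        while covered == len(need):
            size = right - left + 1
            if best is None or size < best:
                best = size
            left_word = doc_words[left]
            if left_word in need:
                have[left_word] -= 1
                if have[left_word] == 0:
                    covered -= 1
            left += 1
    return best


def normalized_min_window(doc_words, query_words):
    """Window size divided by the request length (assumed normalization)."""
    size = min_window_size(doc_words, query_words)
    return None if size is None else size / len(query_words)
```

Under this reading, a value of 1.0 means the request words appear adjacent to each other in the document, and larger values mean they are spread further apart.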
FioFromOriginalRequestTextCosineMatchMaxPrediction
web_l2: 68
Factor for the full name (FIO) from the original request, computed over the document text. Algorithm: CosineMatchMaxPrediction.
|
AllFioFromOriginalRequestAllMaxFBodyChain0Wcm
web_l2: 69
Factor for all full names (FIO) from the original request. Aggregation over all expansions; aggregation type: the greatest value of the factor. Computed over the document body. Algorithm: Chain0Wcm
|
AllFioFromOriginalRequestAllMaxFBodyMinWindowSize
web_l2: 70
Factor for all full names (FIO) from the original request. Aggregation over all expansions; aggregation type: the greatest value of the factor. Computed over the document body. The minimum window size that includes all the words of the request, normalized by the number of words in the request.
|
AllFioFromOriginalRequestAllMaxFTextCosineMatchMaxPrediction
web_l2: 71
Factor for all full names (FIO) from the original request. Aggregation over all expansions; aggregation type: the greatest value of the factor. Computed over the document text. Algorithm: CosineMatchMaxPrediction.
|
IsConstraintPassed
web_l2: 72
If constraints are present, the value is 1 when they are passed and 0.5 otherwise; if there are no constraints, the value is 0
|
TelFullAttributeTextBocm15K001
web_l2: 73
Factor for the Tel_Full telephone attribute from the original request, computed over the document text. Word-weight aggregation algorithm: Bocm15. Normalization coefficient 0.01.
|
DssmFullSplitBert
web_l2: 74
|
AliceAramusicDssmL2
web_l2: 76
|
RightDssmLogDwellRegChain
web_meta_itditp: 85
|
MetaMetaResidDssmLogDwellTimeBigramsDot
web_meta_itditp: 104
Meta:Resid metafactor on web_itditp:DssmLogDwellTimeBigramsDot(606)
|
MetaSDDFT_GREATER_CNTDssmBoostingXfOneSeAmSsHardDot
web_meta_itditp: 106
SD:DFT_GREATER_CNT metafactor on web_itditp:DssmBoostingXfOneSeAmSsHardDot(657)
|
MetaMetaResidSynS1
web_meta_itditp: 135
Meta:Resid metafactor on web_itditp:SynS1(33)
|
MetaMetaRmseRusWordsInText
web_meta_itditp: 136
Meta:Rmse metafactor on web_itditp:RusWordsInText(49)
|
MetaMetaResidMinDssmLogDwellTimeBigramsDot
web_meta_itditp: 143
Meta:ResidMin metafactor on web_itditp:DssmLogDwellTimeBigramsDot(606)
|
MetaMutualSerpDFT_MAXDssmBoostingXfWtd45Dot
web_meta_itditp: 144
MutualSerp:DFT_MAX metafactor on web_itditp:DssmBoostingXfWtd45Dot(632)
|
PantherDwelltimeDot
web_meta_itditp: 149
|
FadingEmbLogDwelltimeBigramsDoc01daysDwt120LessUserHistory
web_meta_itditp: 160
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.1days, dwelltime less than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc01daysDwt120MoreUserHistory
web_meta_itditp: 161
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.1days, dwelltime more than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc05daysDwt120LessUserHistory
web_meta_itditp: 162
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.5days, dwelltime less than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc05daysDwt120MoreUserHistory
web_meta_itditp: 163
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.5days, dwelltime more than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc3daysDwt120LessUserHistory
web_meta_itditp: 164
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=3days, dwelltime less than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc3daysDwt120MoreUserHistory
web_meta_itditp: 165
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=3days, dwelltime more than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc01daysDwt120LessSpyLog
web_meta_itditp: 172
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.1days, dwelltime less than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc01daysDwt120MoreSpyLog
web_meta_itditp: 173
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.1days, dwelltime more than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc05daysDwt120LessSpyLog
web_meta_itditp: 174
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.5days, dwelltime less than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc05daysDwt120MoreSpyLog
web_meta_itditp: 175
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.5days, dwelltime more than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc3daysDwt120LessSpyLog
web_meta_itditp: 176
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=3days, dwelltime less than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc3daysDwt120MoreSpyLog
web_meta_itditp: 177
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=3days, dwelltime more than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc01daysDwt120LessWatchLog
web_meta_itditp: 184
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.1days, dwelltime less than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc01daysDwt120MoreWatchLog
web_meta_itditp: 185
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.1days, dwelltime more than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc05daysDwt120LessWatchLog
web_meta_itditp: 186
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.5days, dwelltime less than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc05daysDwt120MoreWatchLog
web_meta_itditp: 187
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.5days, dwelltime more than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc3daysDwt120LessWatchLog
web_meta_itditp: 188
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=3days, dwelltime less than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc3daysDwt120MoreWatchLog
web_meta_itditp: 189
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=3days, dwelltime more than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc01daysDwt120Less
web_meta_pers: 0
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.1days, dwelltime less than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc01daysDwt120More
web_meta_pers: 1
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.1days, dwelltime more than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc05daysDwt120Less
web_meta_pers: 2
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.5days, dwelltime less than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc05daysDwt120More
web_meta_pers: 3
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.5days, dwelltime more than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc3daysDwt120Less
web_meta_pers: 4
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=3days, dwelltime less than 120sec)
|
FadingEmbLogDwelltimeBigramsDoc3daysDwt120More
web_meta_pers: 5
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=3days, dwelltime more than 120sec)
|
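The fading-embedding factors above compare a document embedding with a time-decayed summary of the user's history. The sketch below is only one plausible reading. The exponential decay law, the interpretation of fadingCoef as a time constant in days, the record layout, and all function names are assumptions; the source states only that the result is a cosine similarity and that history is split by dwell time.

```python
import math


def fading_embedding(history, now_days, fading_coef_days,
                     min_dwell=None, max_dwell=None):
    """Decay-weighted sum of history embeddings.

    history: iterable of (embedding, visit_time_days, dwell_seconds).
    Keeps visits with dwell > min_dwell and dwell < max_dwell when given.
    Returns None if no visit passes the filter.
    """
    acc = None
    for emb, visit_time, dwell in history:
        if min_dwell is not None and dwell <= min_dwell:
            continue
        if max_dwell is not None and dwell >= max_dwell:
            continue
        # Assumed decay law: weight halves roughly every 0.69 * fading_coef_days.
        weight = math.exp(-(now_days - visit_time) / fading_coef_days)
        if acc is None:
            acc = [0.0] * len(emb)
        for i, x in enumerate(emb):
            acc[i] += weight * x
    return acc


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 for a zero vector."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
```

With a small time constant such as 0.1 days, only the most recent visits contribute noticeably, while a 3-day constant blends in older history.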
ClusteredEmbLogDwelltimeBigramsDocDwt120LessTopavg1
web_meta_pers: 16
Top1 weighted average cosine similarity between document and clustered embedding of documents from user_history (model=LogDwellTimeBigrams, dwelltime more than 120sec)
|
ClusteredEmbLogDwelltimeBigramsDocDwt120MoreTopavg1
web_meta_pers: 17
Top1 weighted average cosine similarity between document and clustered embedding of documents from user_history (model=LogDwellTimeBigrams, dwelltime less than 120sec)
|
ClusteredEmbLogDwelltimeBigramsDocDwt120LessTopavg2
web_meta_pers: 18
Top2 weighted average cosine similarity between document and clustered embedding of documents from user_history (model=LogDwellTimeBigrams, dwelltime more than 120sec)
|
ClusteredEmbLogDwelltimeBigramsDocDwt120MoreTopavg2
web_meta_pers: 19
Top2 weighted average cosine similarity between document and clustered embedding of documents from user_history (model=LogDwellTimeBigrams, dwelltime less than 120sec)
|
ClusteredEmbLogDwelltimeBigramsDocDwt120LessCounter30
web_meta_pers: 20
cnt / (1 + cnt), cnt = sum of weights where cosine between document and clustered embedding of documents from user_history > 0.30 (model=LogDwellTimeBigrams, dwelltime more than 120sec)
|
ClusteredEmbLogDwelltimeBigramsDocDwt120MoreCounter30
web_meta_pers: 21
cnt / (1 + cnt), cnt = sum of weights where cosine between document and clustered embedding of documents from user_history > 0.30 (model=LogDwellTimeBigrams, dwelltime less than 120sec)
|
ClusteredEmbLogDwelltimeBigramsDocDwt120LessCounter60
web_meta_pers: 22
cnt / (1 + cnt), cnt = sum of weights where cosine between document and clustered embedding of documents from user_history > 0.60 (model=LogDwellTimeBigrams, dwelltime more than 120sec)
|
ClusteredEmbLogDwelltimeBigramsDocDwt120MoreCounter60
web_meta_pers: 23
cnt / (1 + cnt), cnt = sum of weights where cosine between document and clustered embedding of documents from user_history > 0.60 (model=LogDwellTimeBigrams, dwelltime less than 120sec)
|
ClusteredEmbLogDwelltimeBigramsDocDwt120LessCounter90
web_meta_pers: 24
cnt / (1 + cnt), cnt = sum of weights where cosine between document and clustered embedding of documents from user_history > 0.90 (model=LogDwellTimeBigrams, dwelltime more than 120sec)
|
ClusteredEmbLogDwelltimeBigramsDocDwt120MoreCounter90
web_meta_pers: 25
cnt / (1 + cnt), cnt = sum of weights where cosine between document and clustered embedding of documents from user_history > 0.90 (model=LogDwellTimeBigrams, dwelltime less than 120sec)
|
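The Counter factors above share one formula, cnt / (1 + cnt), where cnt sums the weights of clustered embeddings whose cosine with the document exceeds a threshold. A minimal sketch, assuming the cosines and cluster weights are already available as pairs and that weights are non-negative (the function name and input layout are hypothetical):

```python
def saturated_counter(scored_clusters, threshold):
    """cnt / (1 + cnt), where cnt is the total weight of clusters whose
    cosine with the document is strictly greater than `threshold`.

    scored_clusters: iterable of (cosine, weight) pairs.
    """
    cnt = sum(weight for cos, weight in scored_clusters if cos > threshold)
    return cnt / (1.0 + cnt)
```

The result is 0 when no cluster passes the threshold and approaches 1 as the matching weight grows, so the 0.30, 0.60 and 0.90 variants differ only in how strict the match must be.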
FadingEmbLogDwelltimeBigramsDocDays01Dwt30More
web_meta_pers: 77
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.1days, dwelltime more than 30sec)
|
FadingEmbLogDwelltimeBigramsDocDays01Dwt300More
web_meta_pers: 78
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.1days, dwelltime more than 300sec)
|
FadingEmbLogDwelltimeBigramsDocDays01Dwt600More
web_meta_pers: 79
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=0.1days, dwelltime more than 600sec)
|
FadingEmbLogDwelltimeBigramsDocDays12Dwt90More
web_meta_pers: 80
Cosine similarity between document and fading embedding of documents from RTMR user_history (model=LogDwellTimeBigrams, fadingCoef=12days, dwelltime more than 90sec)
|
ClustEmbLogDtBigramsDocDwt120LessWeights0AllScoreThreshold95pSatweightsum
web_meta_pers: 89
Take clustered embeddings (short clicks), Calc dot products with doc embedding, Take dot product more than 0.95, f = 1 / (1 + sum of weights)
|
ClustEmbLogDtBigramsDocDwt120MoreWeights0AllMaxscore
web_meta_pers: 97
Take clustered embeddings (long clicks), Calc dot products with doc embedding, f = Max(DotProduct)
|
ClustEmbLogDtBigramsDocDwt120MoreWeights0AllMaxScorexweight
web_meta_pers: 98
Take clustered embeddings (long clicks), Calc dot products with doc embedding, f = AVG(DotProduct[i] * Weight[i])
|
ClustEmbLogDtBigramsDocDwt120MoreWeights0AllScoreThreshold95pSatweightsum
web_meta_pers: 99
Take clustered embeddings (long clicks), Calc dot products with doc embedding, Take dot product more than 0.95, f = 1 / (1 + sum of weights)
|
ClustEmbLogDtBigramsDocDwt120MoreWeights5LessMaxscore
web_meta_pers: 100
Take clustered embeddings (long clicks, centroids with weights less than 5), Calc dot products with doc embedding, f = Max(DotProduct)
|
ClustEmbLogDtBigramsDocDwt120MoreWeights5LessMintop5pScore
web_meta_pers: 101
Take clustered embeddings (long clicks, centroids with weights less than 5), Calc dot products with doc embedding, sort descending, f = DotProducts[0.05 * DotProduct.size()]
|
ClustEmbLogDtBigramsDocDwt120MoreWeights5LessMintop30pScore
web_meta_pers: 102
Take clustered embeddings (long clicks, centroids with weights less than 5), Calc dot products with doc embedding, sort descending, f = DotProducts[0.30 * DotProduct.size()]
|
ClustEmbLogDtBigramsDocDwt120MoreWeights5LessMinscore
web_meta_pers: 103
Take clustered embeddings (long clicks, centroids with weights less than 5), Calc dot products with doc embedding, f = Min(DotProducts)
|
ClustEmbLogDtBigramsDocDwt120MoreWeights5LessAvgtop5pScorexweight
web_meta_pers: 104
Take clustered embeddings (long clicks, centroids with weights less than 5), Calc dot products with doc embedding, sort descending, take top 5%, f = AVG(DotProduct[i] * Weight[i])
|
ClustEmbLogDtBigramsDocDwt120MoreWeights5LessScoreThreshold30pSatweightsum
web_meta_pers: 105
Take clustered embeddings (long clicks, centroids with weights less than 5), Calc dot products with doc embedding, Take dot product more than 0.30, f = 1 / (1 + sum of weights)
|
ClustEmbLogDtBigramsDocDwt120MoreWeights5MoreMintop10pScore
web_meta_pers: 106
Take clustered embeddings (long clicks, centroids with weights more than 5), Calc dot products with doc embedding, sort descending, f = DotProducts[0.10 * DotProduct.size()]
|
ClustEmbLogDtBigramsDocDwt120MoreWeights10LessMaxscore
web_meta_pers: 107
Take clustered embeddings (long clicks, centroids with weights less than 10), Calc dot products with doc embedding, f = Max(DotProduct)
|
ClustEmbLogDtBigramsDocDwt120MoreWeights10LessMintop10pScore
web_meta_pers: 108
Take clustered embeddings (long clicks, centroids with weights less than 10), Calc dot products with doc embedding, sort descending, f = DotProducts[0.10 * DotProduct.size()]
|
ClustEmbLogDtBigramsDocDwt120MoreWeights20LessMaxScorexweight
web_meta_pers: 109
Take clustered embeddings (long clicks, centroids with weights less than 20), Calc dot products with doc embedding, f = AVG(DotProduct[i] * Weight[i])
|
ClustEmbLogDtBigramsDocDwt120MoreWeights20LessScoreThreshold70pSatweightsum
web_meta_pers: 110
Take clustered embeddings (long clicks, centroids with weights less than 20), Calc dot products with doc embedding, Take dot product more than 0.70, f = 1 / (1 + sum of weights)
|
ClustEmbLogDtBigramsDocDwt120MoreWeights30LessMaxscore
web_meta_pers: 111
Take clustered embeddings (long clicks, centroids with weights less than 30), Calc dot products with doc embedding, f = Max(DotProduct)
|
ClustEmbLogDtBigramsDocDwt120MoreWeights30LessAvgtop20pScorexweight
web_meta_pers: 112
Take clustered embeddings (long clicks, centroids with weights less than 30), Calc dot products with doc embedding, sort descending, take top 20%, f = AVG(DotProduct[i] * Weight[i])
|
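The ClustEmb* descriptions above all follow one recipe: filter the centroids by weight, compute dot products with the document embedding, then aggregate. The sketch below spells out the aggregations named in the descriptions. The data layout, the function names, the clamping of the percentile index, and the reading of "sum of weights" as the sum over centroids that pass the score threshold are assumptions, not confirmed details.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def score_pairs(centroids, doc_emb, weight_less_than=None, weight_more_than=None):
    """centroids: list of (embedding, weight). Returns [(dot_product, weight), ...]."""
    pairs = []
    for emb, weight in centroids:
        if weight_less_than is not None and not weight < weight_less_than:
            continue
        if weight_more_than is not None and not weight > weight_more_than:
            continue
        pairs.append((dot(emb, doc_emb), weight))
    return pairs


# All aggregations below assume `pairs` is non-empty.
def max_score(pairs):
    return max(score for score, _ in pairs)


def min_score(pairs):
    return min(score for score, _ in pairs)


def min_top_p_score(pairs, p):
    """Sort descending, f = DotProducts[p * size] (index clamped, an assumption)."""
    ordered = sorted((score for score, _ in pairs), reverse=True)
    return ordered[min(int(p * len(ordered)), len(ordered) - 1)]


def sat_weight_sum(pairs, threshold):
    """f = 1 / (1 + sum of weights of centroids whose dot product exceeds threshold)."""
    total = sum(weight for score, weight in pairs if score > threshold)
    return 1.0 / (1.0 + total)


def avg_top_p_score_x_weight(pairs, p):
    """Sort descending by score, keep the top share p, f = AVG(score * weight)."""
    ordered = sorted(pairs, key=lambda sw: sw[0], reverse=True)
    top = ordered[:max(1, int(p * len(ordered)))]
    return sum(score * weight for score, weight in top) / len(top)
```

For example, a factor such as ...Weights5LessMaxscore would correspond to `max_score(score_pairs(centroids, doc_emb, weight_less_than=5))` in this sketch.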
FadingEmbLogDwelltimeBigramsQueryXDoc001days
web_meta_pers: 116
Cosine similarity between doc embedding and fading embedding of queries from RTMR user_history, (model=LogDwellTimeBigrams, fadingCoef=0.01days)
|
ClustEmbLogDtBigramsQueryXDocWeights3LessMintop5pScore
web_meta_pers: 130
Take clustered embeddings (query embeddings, weights less than 3), Calc dot product with doc embedding, Sort descending, f = DotProduct[0.05 * length(DotProducts)]
|
ClustEmbLogDtBigramsQueryXDocWeights3LessMinscore
web_meta_pers: 131
Take clustered embeddings (query embeddings, weights less than 3), Calc dot product with doc embedding, f = MIN(DotProducts)
|
ClustEmbLogDtBigramsQueryXDocWeights5LessMinscore
web_meta_pers: 132
Take clustered embeddings (query embeddings, weights less than 5), Calc dot product with doc embedding, f = MIN(DotProducts)
|
ClustEmbLogDtBigramsQueryXDocWeights15LessScoreThreshold45pSatweightsum
web_meta_pers: 133
Take clustered embeddings (query embeddings, weights less than 15), Calc dot product with doc embedding, Take dot product with score > 0.45, f = 1 / (1 + sum of weights)
|
ClustEmbLogDtBigramsQueryXDocAllWeightsMaxScoreXWeight
web_meta_pers: 134
Take clustered embeddings (query embeddings, all weights), Calc dot product with doc embedding, f = MAX(Score[i] * Weights[i])
|
ClustEmbLogDtBigramsQueryXDocAllWeightsScoreThreshold55pSatweightsum
web_meta_pers: 135
Take clustered embeddings (query embeddings, all weights), Calc dot product with doc embedding, Take dot product with score > 0.55, f = 1 / (1 + sum of weights)
|
LogDtBigramsUserLast10QueriesXDocMaxScoreXWeight
web_meta_pers: 137
Take last 10 query embeddings, Calc dot product with doc embedding, f = MAX(Score[i] * Weights[i])
|
MetaAvgTRp1
web_meta: 0
Avg metafactor on TRp1(4)
|
MetaAvgSegmentWordPortionFromMainContent
web_meta: 8
Avg metafactor on SegmentWordPortionFromMainContent(723)
|
MetaRmsTRp1
web_meta: 22
Rms metafactor on TRp1(4)
|
MetaRmsHasNoAllWordsTRSy
web_meta: 27
Rms metafactor on HasNoAllWordsTRSy(138)
|
MetaRmsSynPercentBadWordPairs
web_meta: 29
Rms metafactor on SynPercentBadWordPairs(353)
|
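The Avg and Rms metafactors above aggregate one base factor over the set of documents retrieved for a query. A minimal sketch, assuming that "Rms" means root mean square and that the aggregation runs over the plain list of factor values (function names are hypothetical):

```python
import math


def meta_avg(values):
    """Arithmetic mean of a base factor over the document set."""
    return sum(values) / len(values)


def meta_rms(values):
    """Root mean square of a base factor over the document set (assumed reading of Rms)."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def meta_nonzero_share(values):
    """Share of documents whose base factor is nonzero (assumed reading of NonZero)."""
    return sum(1 for v in values if v != 0) / len(values)
```

Because these values depend only on the document set, every document of the same query receives the same metafactor value under this reading.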
MetaAuxTextBM25Avg
web_meta: 65
The average value of the AuxTextBM25 factor over the PRS.
|
MetaAuxTitleBM25NonZero
web_meta: 66
The share of nonzero values of the AuxTitleBM25 factor over the PRS.
|
MetaBOCMAvg
web_meta: 69
The share of the maximum value per request among the PRS values of the BCLM factor.
|
SDWeb1040Max
web_meta: 99
Maximum of the XFDTSHOWTOPMINWFFELD3BCLMWEIGHTEDFLOGW0K0001 factor, computed over all similar documents
|
SDWeb1094SumWFNormSumW
web_meta: 104
Weighted sum of the TEXTBM11NORM16384 factor over similar documents, normalized by the sum of the weights of the similar documents
|
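The SumWFNormSumW aggregation described above is a weighted mean of a factor over similar documents. A minimal sketch, assuming each similar document comes with a non-negative similarity weight (the function name and the zero-weight fallback are illustrative choices):

```python
def sum_wf_norm_sum_w(factor_values, weights):
    """sum(w_i * f_i) / sum(w_i) over similar documents; 0.0 if all weights are zero."""
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0  # ASSUMPTION: the fallback value is not given in the listing
    return sum(w * f for w, f in zip(weights, factor_values)) / total_weight
```

With factor values `[1.0, 0.0]` and weights `[3.0, 1.0]`, the heavier neighbour dominates and the sketch returns 0.75.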
MetaAvgHasDownloadLinkOnFile
web_meta: 109
Avg metafactor on HasDownloadLinkOnFile(avg 682)
|
MetaWeb380Web685ProductInvAvg
web_meta: 134
The average value over the PRS of the expression (1 - DaterAge) * (1 - DiversityCategreview)
|
MetaWeb365Web487ProductInvAvg
web_meta: 137
The average value over the PRS of the expression (1 - ruswordsintitle) * (1 - GSKURLMODEL)
|
MetaWeb201Web494ProductPos
web_meta: 140
Position of HasPayments * QueryCommercialitymx among all documents in the PRS
|
MetaWeb1025Web1181ProductPos
web_meta: 142
Position of XFDTSHOWALMAXFELDSET2BM15FLOGK0001 * SimpleClickMixMatchHTEDVALUE among all documents in the PRS
|
MetaWeb1099Web1219ProductInvPos
web_meta: 144
Position of (1 - Fieldset3bclmweightedflogw0k0001) * (1 - DssmLongMiddleShortVsHardClicks) among all documents in the PRS
|
MetaResidMaxDBMNumbers
web_meta: 168
ResidMax metafactor on Production:DBMNumbers(662)
|
MetaAvgBodyMinWindowSize
web_meta: 179
Avg metafactor on Production:BodyMinWindowSize(1103)
|
MetaMaxDssmMiddleVsShortLongHardNoClicks
web_meta: 182
Max metafactor on Production:DssmMiddleVsShortLongHardNoClicks(1221)
|
MetaMaxDssmLogDwellTimeBigrams
web_meta: 186
Max metafactor on Production:DssmLogDwellTimeBigrams(1338)
|
MetaRmsMeanWordLength
web_meta: 188
Rms metafactor on Production:MeanWordLength(366)
|
MetaNonzeroDaterAge
web_meta: 189
Nonzero metafactor on Production:DaterAge(380)
|
MetaRmseAdvPronounsPortion
web_meta: 190
Rmse metafactor on Production:AdvPronounsPortion(402)
|
MetaAvgLongestText
web_meta: 191
Avg metafactor on Production:LongestText(410)
|
MetaRmseQueryWordSequencesTR
web_meta: 194
Rmse metafactor on Production:QueryWordSequencesTR(504)
|
MetaResidDBMNumbers
web_meta: 197
Resid metafactor on Production:DBMNumbers(662)
|
MetaResidMaxQueryDocTitleRangesMatchingScore
web_meta: 201
ResidMax metafactor on Production:QueryDocTitleRangesMatchingScore(866)
|
MetaMaxXfDtShowAllMaxFFieldSetUTBm15FLogW0
web_meta: 203
Max metafactor on Production:XfDtShowAllMaxFFieldSetUTBm15FLogW0(1027)
|
MetaDFT_GREATER_CNTDssmLogDwellTimeBigrams
web_meta: 204
DFT_GREATER_CNT metafactor on Production:DssmLogDwellTimeBigrams(1338)
|
MetaDFT_SUM_WF_NORM_SUM_WDssmLogDwellTimeBigrams
web_meta: 205
DFT_SUM_WF_NORM_SUM_W metafactor on Production:DssmLogDwellTimeBigrams(1338)
|
MetaDFT_GREATER_CNTRequestWithRegionNameFieldSet2Bm15FLogK0001
web_meta: 206
DFT_GREATER_CNT metafactor on Production:RequestWithRegionNameFieldSet2Bm15FLogK0001(1264)
|
MetaEpsHashShareTitleIdfSumFixed
web_meta: 241
Eps hash metafactor on Production:TitleIdfSumFixed(358)
|
MetaEpsHashShareNationalLanguage
web_meta: 243
Eps hash metafactor on Production:NationalLanguage(553)
|
MutualSerpSimDftMaxBodyMinWindowSize
web_meta: 252
Similar documents, type of similarity: mutualserp. Max aggregation of the BodyMinWindowSize factor
|
MutualSerpSimDftSumWfNormSumWDssmBigramsQueryDerivativeMax
web_meta: 253
Similar documents, type of similarity: mutualserp. SumWfNormSumW aggregation of the DssmBigramsQueryDerivativeMax factor
|
MutualSerpSimDftMaxDssmLogDwellTimeBigrams
web_meta: 254
Similar documents, type of similarity: mutualserp. Max aggregation of the DssmLogDwellTimeBigrams factor
|
MutualSerpSimDftSumWfNormSumWHasNoAllWordsTRSy
web_meta: 255
Similar documents, type of similarity: mutualserp. SumWfNormSumW aggregation of the HasNoAllWordsTRSy factor
|
MutualSerpSimDftSumWfNormSumWDssmLongVsMiddleShortNoClicks
web_meta: 257
Similar documents, type of similarity: mutualserp. SumWfNormSumW aggregation of the DssmLongVsMiddleShortNoClicks factor
|
MutualSerpSimDftMaxDssmLongMiddleShortVsHardClicks
web_meta: 259
Similar documents, type of similarity: mutualserp. Max aggregation of the DssmLongMiddleShortVsHardClicks factor
|
MetaMetaAvgDssmQueryDwellTime
web_meta: 328
Meta:Avg metafactor on web_production:DssmQueryDwellTime(1406)
|
MetaMetaRmseDssmLogDtBigramsAMHardQueriesNoClicks
web_meta: 329
Meta:Rmse metafactor on web_production:DssmLogDtBigramsAMHardQueriesNoClicks(1523)
|
MetaMetaResidMaxXfDtShowKnnTopSumW2FSumWFieldSet1Bm15FLogK0001
web_meta: 331
Meta:ResidMax metafactor on web_production:XfDtShowKnnTopSumW2FSumWFieldSet1Bm15FLogK0001(1583)
|
MetaMetaResidMaxDssmLogDtBigramsAMHardQueriesNoClicksMixed
web_meta: 332
Meta:ResidMax metafactor on web_production:DssmLogDtBigramsAMHardQueriesNoClicksMixed(1596)
|
MetaMetaEpsHashShareDssmLogDtBigramsAMHardQueriesNoClicksMixed
web_meta: 333
Meta:EpsHashShare metafactor on web_production:DssmLogDtBigramsAMHardQueriesNoClicksMixed(1596)
|
MetaMetaResidDocLen
web_meta: 337
Meta:Resid metafactor on web_production:DocLen(110)
|
MetaMetaResidSynFLremap1
web_meta: 338
Meta:Resid metafactor on web_production:SynFLremap1(335)
|
MetaMetaResidQueryToTextAllSumFCountTextBocm11Norm256
web_meta: 344
Meta:Resid metafactor on web_production:QueryToTextAllSumFCountTextBocm11Norm256(1400)
|
MetaMetaResidDssmQueryDwellTime
web_meta: 345
Meta:Resid metafactor on web_production:DssmQueryDwellTime(1406)
|
MetaMetaResidDssmLogDtBigramsAMHardQueriesNoClicksMixed
web_meta: 348
Meta:Resid metafactor on web_production:DssmLogDtBigramsAMHardQueriesNoClicksMixed(1596)
|
MetaFractNeuroTextModelLongClickPredictorByWordAndBigramCountersWithSSHards
web_meta: 413
Fract metafactor on NeuroTextModelLongClickPredictorByWordAndBigramCountersWithSSHards(1845).
|
MetaResidNeuroTextModelLongClickPredictorByWordAndBigramCountersWithSSHards
web_meta: 414
Resid metafactor on NeuroTextModelLongClickPredictorByWordAndBigramCountersWithSSHards(1845).
|
MetaEpsHashShareNeuroTextModelLongClickPredictorByWordAndBigramCountersWithSSHards
web_meta: 415
EpsHashShare metafactor on NeuroTextModelLongClickPredictorByWordAndBigramCountersWithSSHards(1845).
|
MetaSameWordsMaskDispersionFieldSetUTSynonymAddTime
web_meta: 417
Dispersion of the AddTime factor over the part of the PRS where the mask of words in the document's URL+Title that matched as a synonym or better is the same as for this document
|
MetaSameWordsMaskDispersionFieldSetUTSynonymDBMNumbers
web_meta: 418
Dispersion of the DBMNumbers factor over the part of the PRS where the mask of words in the document's URL+Title that matched as a synonym or better is the same as for this document
|
MetaSameWordsMaskDispersionFieldSetUTSynonymDoubleFrcAllWcmMatch95AvgValue
web_meta: 419
Dispersion of the DoubleFrcAllWcmMatch95AvgValue factor over the part of the PRS where the mask of words in the document's URL+Title that matched as a synonym or better is the same as for this document
|
MetaSameWordsMaskAvgFieldSetUTSynonymDoubleFrcAllWcmMatch95AvgValue
web_meta: 420
The average value of the DoubleFrcAllWcmMatch95AvgValue factor over the part of the PRS where the mask of words in the document's URL+Title that matched as a synonym or better is the same as for this document
|
MetaSameWordsMaskMaxTextExactOneClickFrcXfSpBclmWeightedProximity1Bm15Size1K001
web_meta: 421
The maximum value of the OneClickFrcXfSpBclmWeightedProximity1Bm15Size1K001 factor over the part of the PRS where the mask of words in the document text that matched exactly is the same as for this document
|
MetaSameWordsMaskMaxTextExactFirstClickDtXfBclmPlaneProximity1Bm15W0Size1K0001
web_meta: 422
The maximum value of the FirstClickDtXfBclmPlaneProximity1Bm15W0Size1K0001 factor over the part of the PRS where the mask of words in the document text that matched exactly is the same as for this document
|
MetaSameWordsMaskDispersionFieldSetUTSynonymOneClickFrcXfSpPerWordCMMaxMatchMin
web_meta: 423
Dispersion of the OneClickFrcXfSpPerWordCMMaxMatchMin factor over the part of the PRS where the mask of words in the document's URL+Title that matched as a synonym or better is the same as for this document
|
MetaSameWordsMaskAvgFieldSetUTSynonymOneClickFrcXfSpPerWordCMMaxMatchMin
web_meta: 424
The average value of the OneClickFrcXfSpPerWordCMMaxMatchMin factor over the part of the PRS where the mask of words in the document's URL+Title that matched as a synonym or better is the same as for this document
|
MetaSameWordsMaskDispersionFieldSetUTSynonymQfufAllSumWFSumWQueryDwellTimeMixMatchWeightedValue
web_meta: 425
Dispersion of the QfufAllSumWFSumWQueryDwellTimeMixMatchWeightedValue factor over the part of the PRS where the mask of words in the document's URL+Title that matched as a synonym or better is the same as for this document
|
MetaSameWordsMaskMinFieldSetUTSynonymDssmQueryDwellTime
web_meta: 426
The minimum value of the DssmQueryDwellTime factor over the part of the PRS where the mask of words in the document's URL+Title that matched as a synonym or better is the same as for this document
|
MetaSameWordsMaskAvgTextExactQfufAllMaxFLinkAnnIndicatorAnnotationMaxValueWeighted
web_meta: 427
The average value of the QfufAllMaxFLinkAnnIndicatorAnnotationMaxValueWeighted factor over the part of the PRS where the mask of words in the document text that matched exactly is the same as for this document
|
MetaSameWordsMaskAvgFieldSetUTSynonymQfufAllMaxWFLinkAnnIndicatorFullMatchValue
web_meta: 428
The average value of the QfufAllMaxWFLinkAnnIndicatorFullMatchValue factor over the part of the PRS where the mask of words in the document's URL+Title that matched as a synonym or better is the same as for this document
|
MetaSameWordsMaskAvgTextExactRandomLogHostQClassDownloadAvg
web_meta: 429
The average value of the RandomLogHostQClassDownloadAvg factor over the part of the PRS where the mask of words in the document text that matched exactly is the same as for this document
|
MetaSameWordsMaskAvgFieldSetUTSynonymRandomLogHostIsMusicAvg
web_meta: 430
The average value of the RandomLogHostIsMusicAvg factor over the part of the PRS where the mask of words in the document's URL+Title that matched as a synonym or better is the same as for this document
|
MetaSameWordsMaskAvgFieldSetUTSynonymRandomLogOwnerMetaWeb1099Web1219ProductInvPosLogAvg
web_meta: 431
The average value of the RandomLogOwnerMetaWeb1099Web1219ProductInvPosLogAvg factor over the part of the PRS where the mask of words in the document's URL+Title that matched as a synonym or better is the same as for this document
|
MetaSameWordsMaskAvgFieldSetUTSynonymRandomLogOwnerDssmRandomLogQueryAvgHasPaymentsLogAvg
web_meta: 432
The average value of the RandomLogOwnerDssmRandomLogQueryAvgHasPaymentsLogAvg factor over the part of the PRS where the mask of words in the document's URL+Title that matched as a synonym or better is the same as for this document
|
MetaSameWordsMaskDispersionTextExactRandomLogOwnerPornoQueryLogAvg
web_meta: 433
Dispersion of the RandomLogOwnerPornoQueryLogAvg factor over the part of the PRS where the mask of words in the document text that matched exactly is the same as for this document
|
MetaSameWordsMaskDispersionTextExactRandomLogOwnerNationalLanguageLogAvg
web_meta: 434
Dispersion of the RandomLogOwnerNationalLanguageLogAvg factor over the part of the PRS where the mask of words in the document text that matched exactly is the same as for this document
|
MetaSameWordsMaskAvgFieldSetUTSynonymQfufFilteredByXfOneSeTopSumWFSumWBodyMinWindowSize
web_meta: 435
The average value of the QfufFilteredByXfOneSeTopSumWFSumWBodyMinWindowSize factor over the part of the PRS where the mask of words in the document's URL+Title that matched as a synonym or better is the same as for this document
|
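The MetaSameWordsMask* factors above restrict an aggregate to documents that share the current document's word-match mask. A minimal sketch of that grouping follows. It assumes the mask is an integer bitmask with one bit per word and that "Dispersion" means population variance; the names and data layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pvariance


def same_mask_aggregate(docs, aggregate):
    """docs: list of (mask, factor_value), one entry per document in the set.

    Returns one value per document: `aggregate` applied to the factor values
    of all documents whose mask equals that document's mask.
    """
    groups = defaultdict(list)
    for mask, value in docs:
        groups[mask].append(value)
    per_mask = {mask: aggregate(values) for mask, values in groups.items()}
    return [per_mask[mask] for mask, _ in docs]


# Hypothetical wrappers matching the Avg / Dispersion / Max / Min variants.
def same_mask_avg(docs):
    return same_mask_aggregate(docs, mean)


def same_mask_dispersion(docs):
    return same_mask_aggregate(docs, pvariance)  # ASSUMPTION: population variance


def same_mask_max(docs):
    return same_mask_aggregate(docs, max)


def same_mask_min(docs):
    return same_mask_aggregate(docs, min)
```

In this reading, two documents that matched the same subset of words receive identical metafactor values, while a document with a unique mask is aggregated only with itself.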
MetaMetaResidDssmCtrEngSsHard
web_meta: 445
Meta:Resid metafactor on web_production:DssmCtrEngSsHard(1855)
|
AliceAramusicBert
web_meta: 510
BERT model that predicts 1 (rel), 2 (vital) or 0 (other) for the Arabic Alice music search scenario
|
AliceAramusicBertTest
web_meta: 511
Slot for supporting two parallel aramusic models
|
BertSinsigMSE
web_meta: 522
small-bert-model sbr:1721013142 predict Predict_sinsig_mse_1502766203_standartized_mse_target - Distillation of the sinsig mse bert model 1502766203
|
BertDBDMSE
web_meta: 523
small-bert-model sbr:1721013142 predict Predict_dbd_mse_1502778723_standartized_mse_target Distillation of the dbd mse bert model 1502778723
|
BertSinsigFresh
web_meta: 524
small-bert-model sbr:1721013142 predict Predict_sinsig_fresh_20200118_0502_1528253243_standartized_mse_target Distillation of the fresh bert model 1528253243
|
BertSinsigCEMult012
web_meta: 525
small-bert-model sbr:1721013142 predict Predict_sinsig_ce_multitarget_012_1511368337_standartized_mse_target Distillation of the sinsig ce multitarget bert model 1511368337, head 0.12 threshold
|
BertSinsigCEMult025
web_meta: 526
small-bert-model sbr:1721013142 predict Predict_sinsig_ce_multitarget_025_1511368337_standartized_mse_target Distillation of the sinsig ce multitarget bert model 1511368337, head 0.25 threshold
|
BertSinsigCEMult05
web_meta: 527
small-bert-model sbr:1721013142 predict Predict_sinsig_ce_multitarget_05_1511368337_standartized_mse_target Distillation of the sinsig ce multitarget bert model 1511368337, head 0.5 threshold
|
BertProximaMSE
web_meta: 528
small-bert-model sbr:1721013142 predict Predict_proxima_large_mse_1545302456_standartized_mse_target Distillation of the proxima mse bert model 1545302456
|
BertClickPersCE
web_meta: 529
small-bert-model sbr:1721013142 predict Predict_click_pers_ce_1534051635_standartized_mse_target Distillation of the click pers ce bert model 1534051635
|
BertClickOddMSE
web_meta: 530
small-bert-model sbr:1721013142 predict Predict_click_odd_mse_1512795366_standartized_mse_target Distillation of the click odd mse bert model 1512795366
|
SplitBertProximaMSE
web_meta: 531
split-bert-model sbr:2005741938 predict Predict_mse_prediction_1628966206_proxima_standartized Split distillation of the proxima mse bert model 1628966206
|
SplitBertSinsigCEMult012
web_meta: 532
split-bert-model sbr:2005741938 predict Predict_ce_0_12_1633847052_sinsig_standartized Split distillation of the sinsig ce multitarget bert model 1633847052, head 0.12 threshold
|
SplitBertSinsigCEMult025
web_meta: 533
split-bert-model sbr:2005741938 predict Predict_mse_Predict_ce_0_25_1633847052_sinsig_standartized Split distillation of the sinsig ce multitarget bert model 1633847052, head 0.25 threshold
|
SplitBertSinsigCEMult05
web_meta: 534
split-bert-model sbr:2005741938 predict Predict_mse_Predict_ce_0_5_1633847052_sinsig_standartized Split distillation of the sinsig ce multitarget bert model 1633847052, head 0.5 threshold
|
SplitBertSinsigCEMult078
web_meta: 535
split-bert-model sbr:2005741938 predict mse_Predict_ce_0_78_1633847052_sinsig_standartized Split distillation of the sinsig ce multitarget bert model 1633847052, head 0.78 threshold
|
SplitBertSinsigCEMult13
web_meta: 536
split-bert-model sbr:2005741938 predict Predict_mse_Predict_ce_1_3_1633847052_sinsig_standartized Split distillation of the sinsig ce multitarget bert model 1633847052, head 1.3 threshold
|
SplitBertSinsigCEMult20
web_meta: 537
split-bert-model sbr:2005741938 predict Predict_mse_Predict_ce_2_0_1633847052_sinsig_standartized Split distillation of the sinsig ce multitarget bert model 1633847052, head 2.0 threshold
|
SplitBertProximaMseNoTextInTrain
web_meta: 538
split-bert-model sbr:2005741938 predict Predict_mse_prediction_1625097377_proxima_standartized Split distillation of the proxima mse bert model 1625097377
|
SplitBertDBDMSE
web_meta: 539
split-bert-model sbr:2005741938 predict Predict_mse_prediction_1633307071_dbd_standartized Split distillation of the dbd mse bert model 1633307071
|
SplitBertFresh
web_meta: 540
split-bert-model sbr:2005741938 predict prediction_1528253243_sinsig_fresh_20200118_0502_standartized Split distillation of the fresh bert model 1528253243
|
SplitBertIsPirate
web_meta: 541
split-bert-model sbr:2005741938 predict prediction_1644380449_is_pirate_0807_standartized Split distillation of the is_pirate bert model 1644380449
|
SplitBertNovSinsigMSE
web_meta: 542
split-bert-model sbr:2005741938 predict Predict_oct_large_1871529228_sinsig_relev10_mse_stand_mse Split distillation of the nov-large model with relev10 1871529228
|
SplitBertNovProximaMSE
web_meta: 543
split-bert-model sbr:2005741938 predict Predict_oct_large_1872243553_proxima_relev10_mse_stand_mse Split distillation of the nov-large model with relev10 1872243553
|
SplitBertXLLaVMSE
web_meta: 544
split-bert-model sbr:2005741938 predict Predict_xLarge_prediction_nov_lav_1990027087_standartized_mse Split distillation of the xlarge model with relev10 1990027087
|
SplitBertXLProximaMSE
web_meta: 545
split-bert-model sbr:2005741938 predict Predict_xLarge_prediction_nov_proxima_1970068024_standartized_mse Split distillation of the nov-large model with relev10 1970068024
|
SplitBertFinLawMSE
web_meta: 546
split-bert-model sbr:2005741938 predict Predict_Large_predict_target_fin_law_1967416198_standartized_mse Split distillation of the large model 1967416198
|
SplitBertMedMSE
web_meta: 547
split-bert-model sbr:2005741938 predict Predict_Large_predict_target_med_doc_1967182618_standartized_mse Split distillation of the large model 1967182618
|
SplitBertSosMSE
web_meta: 548
split-bert-model sbr:2005741938 predict Predict_Large_predict_target_sos_1969262884_standartized_mse Split distillation of the large model 1969262884
|
SplitBertXlLavPlatformMSE
web_meta: 549
sbr:2411087272 splitv3 head Predict_jul_xlarge_lav_with_platform_2344793612_stand_mse
|
SplitBertXlSinsigBasketsPlatformMse
web_meta: 550
sbr:2411087272 splitv3 head Predict_jul_xlarge_sinsig_baskets_with_platform_2339818891_stand_mse
|
SplitBertXlLavMergeCsTsTtPlatformMse
web_meta: 551
berts_storage:2022/02/artmironov/2831823537 splitv4 head Predict_bert_650_merge_platform_no_trans_v1_2784077050_stand_mse
|
CsBertXlSinsigBasketsMse
web_meta: 555
bert model 2021/09/boyalex/2405042847 head: Predict_jul_xlarge_sinsig_baskets_2339818891_stand_mse
|
CsBertCleverClicksMse
web_meta: 556
bert model 2021/09/boyalex/2405042847 head: Predict_large_cs_clever_clicks_2360931838_stand_mse
|
CsBertGooglePosMse
web_meta: 557
bert model 2021/09/boyalex/2405042847 head: Predict_large_google_2334139725_stand_mse
|
CsBertCsPQMse
web_meta: 558
bert model 2021/09/boyalex/2405042847 head: Predict_pq_large_2394064809_stand_mse
|
CsBertXlCsSinsigsMse
web_meta: 559
bert model 2021/09/boyalex/2405042847 head: Predict_jul_xlarge_cs_sinsig_full_2394990987_stand_mse
|
CsBertXlCsApscoreMse
web_meta: 560
bert model 2021/09/boyalex/2405042847 head: Predict_jul_xlarge_cs_apscore_2397363137_stand_mse
|