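3.5. Generating instruction information
The -gen-instr-info option makes TableGen generate the target machine's instruction description code from the descriptions in the TD files.
3.5.1. The CodeGenSchedModels object
As usual, TableGen's entry function for this option looks simple. The parameter RK is the container holding the Record instances of every class and def definition.
608 void EmitInstrInfo(RecordKeeper &RK, raw_ostream &OS) {
609   InstrInfoEmitter(RK).run(OS);
610   EmitMapTable(RK, OS);
611 }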
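InstrInfoEmitter on line 609 is the wrapper class that generates the code. Its constructor is defined as follows:
38   InstrInfoEmitter(RecordKeeper &R):
39     Records(R), CDP(R), SchedModels(CDP.getTargetInfo().getSchedModels()) {}
CDP is a CodeGenDAGPatterns instance, so the processing triggered by CDP(R) is exactly what the previous section covered: building instruction DAGs from the instruction definitions, without generating any instruction selection code.
The InstrInfoEmitter member SchedModels is a reference to a CodeGenSchedModels, the top-level container for the target's machine model data. The class defines the following data members.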
219 class CodeGenSchedModels {
220   RecordKeeper &Records;
221   const CodeGenTarget &Target;
222
223   // Map dag expressions to Instruction lists.
224   SetTheory Sets;
225
226   // List of unique processor models.
227   std::vector<CodeGenProcModel> ProcModels;
228
229   // Map Processor's MachineModel or ProcItin to a CodeGenProcModel index.
230   typedef DenseMap<Record*, unsigned> ProcModelMapTy;
231   ProcModelMapTy ProcModelMap;
232
233   // Per-operand SchedReadWrite types.
234   std::vector<CodeGenSchedRW> SchedWrites;
235   std::vector<CodeGenSchedRW> SchedReads;
236
237   // List of unique SchedClasses.
238   std::vector<CodeGenSchedClass> SchedClasses;
239
240   // Any inferred SchedClass has an index greater than NumInstrSchedClasses.
241   unsigned NumInstrSchedClasses;
242
243   // Map each instruction to its unique SchedClass index considering the
244   // combination of its itinerary class, SchedRW list, and InstRW records.
245   typedef DenseMap<Record*, unsigned> InstClassMapTy;
246   InstClassMapTy InstrClassMap;
CodeGenTarget::getSchedModels calls the CodeGenSchedModels constructor to create the CodeGenSchedModels object; its RK parameter is the same RK that was passed to EmitInstrInfo.
88 CodeGenSchedModels::CodeGenSchedModels(RecordKeeper &RK,
89                                        const CodeGenTarget &TGT):
90   Records(RK), Target(TGT) {
91
92   Sets.addFieldExpander("InstRW", "Instrs");
93
94   // Allow Set evaluation to recognize the dags used in InstRW records:
95   // (instrs Op1, Op1...)
96   Sets.addOperator("instrs", llvm::make_unique<InstrsOp>());
97   Sets.addOperator("instregex", llvm::make_unique<InstRegexOp>(Target));
98
99   // Instantiate a CodeGenProcModel for each SchedMachineModel with the values
100   // that are explicitly referenced in tablegen records. Resources associated
101   // with each processor will be derived later. Populate ProcModelMap with the
102   // CodeGenProcModel instances.
103   collectProcModels();
104
105   // Instantiate a CodeGenSchedRW for each SchedReadWrite record explicitly
106   // defined, and populate SchedReads and SchedWrites vectors. Implicit
107   // SchedReadWrites that represent sequences derived from expanded variant will
108   // be inferred later.
109   collectSchedRW();
110
111   // Instantiate a CodeGenSchedClass for each unique SchedRW signature directly
112   // required by an instruction definition, and populate SchedClassIdxMap. Set
113   // NumItineraryClasses to the number of explicit itinerary classes referenced
114   // by instructions. Set NumInstrSchedClasses to the number of itinerary
115   // classes plus any classes implied by instructions that derive from class
116   // Sched and provide SchedRW list. This does not infer any new classes from
117   // SchedVariant.
118   collectSchedClasses();
119
120   // Find instruction itineraries for each processor. Sort and populate
121   // CodeGenProcModel::ItinDefList. (Cycle-to-cycle itineraries). This requires
122   // all itinerary classes to be discovered.
123   collectProcItins();
124
125   // Find ItinRW records for each processor and itinerary class.
126   // (For per-operand resources mapped to itinerary classes).
127   collectProcItinRW();
128
129   // Infer new SchedClasses from SchedVariant.
130   inferSchedClasses();
131
132   // Populate each CodeGenProcModel's WriteResDefs, ReadAdvanceDefs, and
133   // ProcResourceDefs.
134   collectProcResources();
135 }
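First, InstRW has a dag member Instrs, which needs special handling (line 92). Second, two special dag operators are registered (lines 96 and 97). One is instrs, whose operands are interpreted as instruction definitions. The other is instregex, whose operands are regular expressions matched against instruction opcode names.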
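To make the idea behind instregex concrete, the following is a minimal sketch that selects instruction names with a regular expression. It uses std::regex rather than TableGen's own SetTheory machinery; the function selectByRegex and the instruction names are hypothetical, and anchoring the match at the start of the name is an assumption of this sketch, not a statement about InstRegexOp.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Returns the names in Instrs matched by Pattern. The match must begin at the
// first character of the name (an assumption made for this sketch).
std::vector<std::string> selectByRegex(const std::vector<std::string> &Instrs,
                                       const std::string &Pattern) {
  std::regex Re(Pattern);
  std::vector<std::string> Out;
  for (const std::string &Name : Instrs)
    if (std::regex_search(Name, Re, std::regex_constants::match_continuous))
      Out.push_back(Name);
  return Out;
}
```

With the hypothetical names ADD32rr, ADD64rr, SUB32rr and VADDPSrr, the pattern "ADD(32|64)rr" would select only the first two.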
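3.5.1.1. SchedMachineModel definitions
The CodeGenSchedModels constructor first processes the SchedMachineModel definitions. As seen earlier, for a processor such as Atom, which relies on itineraries to describe instruction execution, the definition derived from SchedMachineModel is the most important part of that description. For a processor such as SandyBridge, which describes instruction execution through resource usage, the SchedMachineModel-derived definition usually supplies the processor's global information. Each processor names the SchedMachineModel it needs in its own Processor definition, so finding every SchedMachineModel in use through the Processor definitions is the natural approach. Line 140 first sorts all Processor definitions alphabetically with the comparator LessRecordFieldName, which compares the records' Name field.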
138 void CodeGenSchedModels::collectProcModels() {
139   RecVec ProcRecords = Records.getAllDerivedDefinitions("Processor");
140   std::sort(ProcRecords.begin(), ProcRecords.end(), LessRecordFieldName());
141
142   // Reserve space because we can. Reallocation would be ok.
143   ProcModels.reserve(ProcRecords.size()+1);
144
145   // Use idx=0 for NoModel/NoItineraries.
146   Record *NoModelDef = Records.getDef("NoSchedModel");
147   Record *NoItinsDef = Records.getDef("NoItineraries");
148   ProcModels.emplace_back(0, "NoSchedModel", NoModelDef, NoItinsDef);
149   ProcModelMap[NoModelDef] = 0;
150
151   // For each processor, find a unique machine model.
152   for (unsigned i = 0, N = ProcRecords.size(); i < N; ++i)
153     addProcModel(ProcRecords[i]);
154 }
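The ProcModels container in CodeGenSchedModels has type std::vector<CodeGenProcModel>. CodeGenProcModel describes a processor's scheduling model and roughly corresponds to a SchedMachineModel definition in the TD files. It defines the following data members and constructor.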
174 struct CodeGenProcModel {
175   unsigned Index;
176   std::string ModelName;
177   Record *ModelDef;
178   Record *ItinsDef;
179
180   // Derived members...
181
182   // Array of InstrItinData records indexed by a CodeGenSchedClass index.
183   // This list is empty if the Processor has no value for Itineraries.
184   // Initialized by collectProcItins().
185   RecVec ItinDefList;
186
187   // Map itinerary classes to per-operand resources.
188   // This list is empty if no ItinRW refers to this Processor.
189   RecVec ItinRWDefs;
190
191   // All read/write resources associated with this processor.
192   RecVec WriteResDefs;
193   RecVec ReadAdvanceDefs;
194
195   // Per-operand machine model resources associated with this processor.
196   RecVec ProcResourceDefs;
197   RecVec ProcResGroupDefs;
198
199   CodeGenProcModel(unsigned Idx, const std::string &Name, Record *MDef,
200                    Record *IDef) :
201     Index(Idx), ModelName(Name), ModelDef(MDef), ItinsDef(IDef) {}
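ProcModelMap, used on line 149 of collectProcModels, is a typedef of DenseMap<Record*, unsigned>. It maps the Record of a scheduling model to the index of the corresponding CodeGenProcModel object in the ProcModels container. The first scheduling model, at index 0, is the default model; its role is comparable to that of a null pointer.
The remaining CodeGenProcModel instances are created by the CodeGenSchedModels::addProcModel method.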
158 void CodeGenSchedModels::addProcModel(Record *ProcDef) {
159   Record *ModelKey = getModelOrItinDef(ProcDef);
160   if (!ProcModelMap.insert(std::make_pair(ModelKey, ProcModels.size())).second)
161     return;
162
163   std::string Name = ModelKey->getName();
164   if (ModelKey->isSubClassOf("SchedMachineModel")) {
165     Record *ItinsDef = ModelKey->getValueAsDef("Itineraries");
166     ProcModels.emplace_back(ProcModels.size(), Name, ModelKey, ItinsDef);
167   }
168   else {
169     // An itinerary is defined without a machine model. Infer a new model.
170     if (!ModelKey->getValueAsListOfDefs("IID").empty())
171       Name = Name + "Model";
172     ProcModels.emplace_back(ProcModels.size(), Name,
173                             ProcDef->getValueAsDef("SchedModel"), ModelKey);
174   }
175   DEBUG(ProcModels.back().dump());
176 }
The getModelOrItinDef method reads the following data: Processor->SchedModel (of type SchedMachineModel), and Processor->ProcItin (of type ProcessorItineraries)->IID (of type list<InstrItinData>).
273 Record *getModelOrItinDef(Record *ProcDef) const {
274   Record *ModelDef = ProcDef->getValueAsDef("SchedModel");
275   Record *ItinsDef = ProcDef->getValueAsDef("ProcItin");
276   if (!ItinsDef->getValueAsListOfDefs("IID").empty()) {
277     assert(ModelDef->getValueAsBit("NoModel")
278            && "Itineraries must be defined within SchedMachineModel");
279     return ItinsDef;
280   }
281   return ModelDef;
282 }
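As the definition of getModelOrItinDef shows, when both are present the latter (a ProcItin with a non-empty IID) takes precedence over the former (SchedModel). addProcModel then builds the CodeGenProcModel object in one of two ways: (ModelDef: Processor->SchedModel, ItinsDef: Processor->SchedModel->Itineraries), or (ModelDef: Processor->SchedModel, ItinsDef: Processor->ProcItin). In the second case the assertion on line 277 requires the SchedModel to have NoModel set.
Both Itineraries and ProcItin are definitions of type ProcessorItineraries (ItinsDef).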
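The key selection and de-duplication performed by getModelOrItinDef and addProcModel can be sketched on its own. The version below is only an illustration under simplifying assumptions: records are replaced by plain strings, the types ProcessorDesc, ProcModel and ModelTable are hypothetical, and add returns the model index for convenience, whereas the original method returns nothing.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a Processor record; not an LLVM type.
struct ProcessorDesc {
  std::string SchedModel;  // name of the machine model ("NoSchedModel" if none)
  std::string ProcItin;    // name of the ProcessorItineraries definition
  bool ItinHasData;        // true when ProcItin has a non-empty IID list
};

struct ProcModel {
  unsigned Index;
  std::string Name;
};

struct ModelTable {
  std::vector<ProcModel> Models;
  std::map<std::string, unsigned> KeyToIndex;

  ModelTable() {
    // Index 0 is reserved for the "no model" entry.
    Models.push_back({0, "NoSchedModel"});
    KeyToIndex["NoSchedModel"] = 0;
  }

  // A non-empty itinerary takes precedence over the machine model as the key.
  static std::string keyFor(const ProcessorDesc &P) {
    return P.ItinHasData ? P.ProcItin : P.SchedModel;
  }

  // Returns the index of the model used by P, creating it on first use.
  unsigned add(const ProcessorDesc &P) {
    std::string Key = keyFor(P);
    auto Res = KeyToIndex.insert({Key, (unsigned)Models.size()});
    if (Res.second) {
      // Models inferred from an itinerary get a "Model" suffix.
      std::string Name = P.ItinHasData ? Key + "Model" : Key;
      Models.push_back({Res.first->second, Name});
    }
    return Res.first->second;
  }
};
```

Two processors that share a key end up with the same index, because the map insertion fails for the second one and no new entry is appended.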
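3.5.1.2. Processing SchedReadWrite
Next, the CodeGenSchedModels::collectSchedRW method creates a CodeGenSchedRW object for each SchedReadWrite definition in the TD files. SchedReadWrite has an elaborate derivation hierarchy and several sources in the TD files, so each source has to be handled in turn.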
204 void CodeGenSchedModels::collectSchedRW() {
205   // Reserve idx=0 for invalid writes/reads.
206   SchedWrites.resize(1);
207   SchedReads.resize(1);
208
209   SmallPtrSet<Record*, 16> RWSet;
210
211   // Find all SchedReadWrites referenced by instruction defs.
212   RecVec SWDefs, SRDefs;
213   for (const CodeGenInstruction *Inst : Target.instructions()) {
214     Record *SchedDef = Inst->TheDef;
215     if (SchedDef->isValueUnset("SchedRW"))
216       continue;
217     RecVec RWs = SchedDef->getValueAsListOfDefs("SchedRW");
218     for (RecIter RWI = RWs.begin(), RWE = RWs.end(); RWI != RWE; ++RWI) {
219       if ((*RWI)->isSubClassOf("SchedWrite"))
220         scanSchedRW(*RWI, SWDefs, RWSet);
221       else {
222         assert((*RWI)->isSubClassOf("SchedRead") && "Unknown SchedReadWrite");
223         scanSchedRW(*RWI, SRDefs, RWSet);
224       }
225     }
226   }
227   // Find all ReadWrites referenced by InstRW.
228   RecVec InstRWDefs = Records.getAllDerivedDefinitions("InstRW");
229   for (RecIter OI = InstRWDefs.begin(), OE = InstRWDefs.end(); OI != OE; ++OI) {
230     // For all OperandReadWrites.
231     RecVec RWDefs = (*OI)->getValueAsListOfDefs("OperandReadWrites");
232     for (RecIter RWI = RWDefs.begin(), RWE = RWDefs.end();
233          RWI != RWE; ++RWI) {
234       if ((*RWI)->isSubClassOf("SchedWrite"))
235         scanSchedRW(*RWI, SWDefs, RWSet);
236       else {
237         assert((*RWI)->isSubClassOf("SchedRead") && "Unknown SchedReadWrite");
238         scanSchedRW(*RWI, SRDefs, RWSet);
239       }
240     }
241   }
242   // Find all ReadWrites referenced by ItinRW.
243   RecVec ItinRWDefs = Records.getAllDerivedDefinitions("ItinRW");
244   for (RecIter II = ItinRWDefs.begin(), IE = ItinRWDefs.end(); II != IE; ++II) {
245     // For all OperandReadWrites.
246     RecVec RWDefs = (*II)->getValueAsListOfDefs("OperandReadWrites");
247     for (RecIter RWI = RWDefs.begin(), RWE = RWDefs.end();
248          RWI != RWE; ++RWI) {
249       if ((*RWI)->isSubClassOf("SchedWrite"))
250         scanSchedRW(*RWI, SWDefs, RWSet);
251       else {
252         assert((*RWI)->isSubClassOf("SchedRead") && "Unknown SchedReadWrite");
253         scanSchedRW(*RWI, SRDefs, RWSet);
254       }
255     }
256   }
257   // Find all ReadWrites referenced by SchedAlias. AliasDefs needs to be sorted
258   // for the loop below that initializes Alias vectors.
259   RecVec AliasDefs = Records.getAllDerivedDefinitions("SchedAlias");
260   std::sort(AliasDefs.begin(), AliasDefs.end(), LessRecord());
261   for (RecIter AI = AliasDefs.begin(), AE = AliasDefs.end(); AI != AE; ++AI) {
262     Record *MatchDef = (*AI)->getValueAsDef("MatchRW");
263     Record *AliasDef = (*AI)->getValueAsDef("AliasRW");
264     if (MatchDef->isSubClassOf("SchedWrite")) {
265       if (!AliasDef->isSubClassOf("SchedWrite"))
266         PrintFatalError((*AI)->getLoc(), "SchedWrite Alias must be SchedWrite");
267       scanSchedRW(AliasDef, SWDefs, RWSet);
268     }
269     else {
270       assert(MatchDef->isSubClassOf("SchedRead") && "Unknown SchedReadWrite");
271       if (!AliasDef->isSubClassOf("SchedRead"))
272         PrintFatalError((*AI)->getLoc(), "SchedRead Alias must be SchedRead");
273       scanSchedRW(AliasDef, SRDefs, RWSet);
274     }
275   }
276   // Sort and add the SchedReadWrites directly referenced by instructions or
277   // itinerary resources. Index reads and writes in separate domains.
278   std::sort(SWDefs.begin(), SWDefs.end(), LessRecord());
279   for (RecIter SWI = SWDefs.begin(), SWE = SWDefs.end(); SWI != SWE; ++SWI) {
280     assert(!getSchedRWIdx(*SWI, /*IsRead=*/false) && "duplicate SchedWrite");
281     SchedWrites.emplace_back(SchedWrites.size(), *SWI);
282   }
283   std::sort(SRDefs.begin(), SRDefs.end(), LessRecord());
284   for (RecIter SRI = SRDefs.begin(), SRE = SRDefs.end(); SRI != SRE; ++SRI) {
285     assert(!getSchedRWIdx(*SRI, /*IsRead-*/true) && "duplicate SchedWrite");
286     SchedReads.emplace_back(SchedReads.size(), *SRI);
287   }
288   // Initialize WriteSequence vectors.
289   for (std::vector<CodeGenSchedRW>::iterator WI = SchedWrites.begin(),
290          WE = SchedWrites.end(); WI != WE; ++WI) {
291     if (!WI->IsSequence)
292       continue;
293     findRWs(WI->TheDef->getValueAsListOfDefs("Writes"), WI->Sequence,
294             /*IsRead=*/false);
295   }
296   // Initialize Aliases vectors.
297   for (RecIter AI = AliasDefs.begin(), AE = AliasDefs.end(); AI != AE; ++AI) {
298     Record *AliasDef = (*AI)->getValueAsDef("AliasRW");
299     getSchedRW(AliasDef).IsAlias = true;
300     Record *MatchDef = (*AI)->getValueAsDef("MatchRW");
301     CodeGenSchedRW &RW = getSchedRW(MatchDef);
302     if (RW.IsAlias)
303       PrintFatalError((*AI)->getLoc(), "Cannot Alias an Alias");
304     RW.Aliases.push_back(*AI);
305   }
306   DEBUG(
307     for (unsigned WIdx = 0, WEnd = SchedWrites.size(); WIdx != WEnd; ++WIdx) {
308       dbgs() << WIdx << ": ";
309       SchedWrites[WIdx].dump();
310       dbgs() << '\n';
311     }
312     for (unsigned RIdx = 0, REnd = SchedReads.size(); RIdx != REnd; ++RIdx) {
313       dbgs() << RIdx << ": ";
314       SchedReads[RIdx].dump();
315       dbgs() << '\n';
316     }
317     RecVec RWDefs = Records.getAllDerivedDefinitions("SchedReadWrite");
318     for (RecIter RI = RWDefs.begin(), RE = RWDefs.end();
319          RI != RE; ++RI) {
320       if (!getSchedRWIdx(*RI, (*RI)->isSubClassOf("SchedRead"))) {
321         const std::string &Name = (*RI)->getName();
322         if (Name != "NoWrite" && Name != "ReadDefault")
323           dbgs() << "Unused SchedReadWrite " << (*RI)->getName() << '\n';
324       }
325     });
326 }
Every type derived from SchedReadWrite is represented by a CodeGenSchedRW object, which is defined as follows:
46 struct CodeGenSchedRW {
47   unsigned Index;
48   std::string Name;
49   Record *TheDef;
50   bool IsRead;
51   bool IsAlias;
52   bool HasVariants;
53   bool IsVariadic;
54   bool IsSequence;
55   IdxVec Sequence;
56   RecVec Aliases;
57
58   CodeGenSchedRW()
59     : Index(0), TheDef(nullptr), IsRead(false), IsAlias(false),
60       HasVariants(false), IsVariadic(false), IsSequence(false) {}
61   CodeGenSchedRW(unsigned Idx, Record *Def)
62     : Index(Idx), TheDef(Def), IsAlias(false), IsVariadic(false) {
63     Name = Def->getName();
64     IsRead = Def->isSubClassOf("SchedRead");
65     HasVariants = Def->isSubClassOf("SchedVariant");
66     if (HasVariants)
67       IsVariadic = Def->getValueAsBit("Variadic");
68
69     // Read records don't currently have sequences, but it can be easily
70     // added. Note that implicit Reads (from ReadVariant) may have a Sequence
71     // (but no record).
72     IsSequence = Def->isSubClassOf("WriteSequence");
73   }
74
75   CodeGenSchedRW(unsigned Idx, bool Read, const IdxVec &Seq,
76                  const std::string &Name)
77     : Index(Idx), Name(Name), TheDef(nullptr), IsRead(Read), IsAlias(false),
78       HasVariants(false), IsVariadic(false), IsSequence(true), Sequence(Seq) {
79     assert(Sequence.size() > 1 && "implied sequence needs >1 RWs");
80   }
81
82   bool isValid() const {
83     assert((!HasVariants || TheDef) && "Variant write needs record def");
84     assert((!IsVariadic || HasVariants) && "Variadic write needs variants");
85     assert((!IsSequence || !HasVariants) && "Sequence can't have variant");
86     assert((!IsSequence || !Sequence.empty()) && "Sequence should be nonempty");
87     assert((!IsAlias || Aliases.empty()) && "Alias cannot have aliases");
88     return TheDef || !Sequence.empty();
89   }
90
91 #ifndef NDEBUG
92   void dump() const;
93 #endif
94 };
The type IdxVec is a typedef of std::vector<unsigned>, and the type RecVec is a typedef of std::vector<Record*>.
CodeGenSchedModels keeps the CodeGenSchedRW objects in the containers SchedWrites and SchedReads; the member Index on line 47 is the object's index in its container.
collectSchedRW first uses the scanSchedRW method to collect the SchedRead and SchedWrite definitions that are actually used; the temporary container RWSet records which definitions have already been seen. These definitions are used in four places: Instruction->SchedRW, InstRW->OperandReadWrites, ItinRW->OperandReadWrites, and SchedAlias->AliasRW.
179 static void scanSchedRW(Record *RWDef, RecVec &RWDefs,
180                         SmallPtrSet<Record*, 16> &RWSet) {
181   if (!RWSet.insert(RWDef).second)
182     return;
183   RWDefs.push_back(RWDef);
184   // Reads don't current have sequence records, but it can be added later.
185   if (RWDef->isSubClassOf("WriteSequence")) {
186     RecVec Seq = RWDef->getValueAsListOfDefs("Writes");
187     for (RecIter I = Seq.begin(), E = Seq.end(); I != E; ++I)
188       scanSchedRW(*I, RWDefs, RWSet);
189   }
190   else if (RWDef->isSubClassOf("SchedVariant")) {
191     // Visit each variant (guarded by a different predicate).
192     RecVec Vars = RWDef->getValueAsListOfDefs("Variants");
193     for (RecIter VI = Vars.begin(), VE = Vars.end(); VI != VE; ++VI) {
194       // Visit each RW in the sequence selected by the current variant.
195       RecVec Selected = (*VI)->getValueAsListOfDefs("Selected");
196       for (RecIter I = Selected.begin(), E = Selected.end(); I != E; ++I)
197         scanSchedRW(*I, RWDefs, RWSet);
198     }
199   }
200 }
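Because WriteSequence and SchedVariant reference further definitions (through Writes, and through the Selected list of each entry in Variants), those definitions must be included as well. The temporary container SWDefs holds the Record objects of the SchedWrite definitions; the Record objects of the SchedRead definitions are likewise collected in SRDefs.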
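The recursive collection with a "seen" set can be illustrated independently of TableGen. The sketch below is hypothetical: definitions are plain names, and RefGraph maps a name to the names it references directly (the members of a sequence, or the definitions selected by its variants). It is meant only to show the visit-once traversal, not the actual record handling.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: name -> names it references directly.
using RefGraph = std::map<std::string, std::vector<std::string>>;

// Appends Name and everything reachable from it to Out, visiting each name once.
void scan(const std::string &Name, const RefGraph &G,
          std::vector<std::string> &Out, std::set<std::string> &Seen) {
  if (!Seen.insert(Name).second)
    return;  // already recorded by an earlier visit
  Out.push_back(Name);
  auto It = G.find(Name);
  if (It == G.end())
    return;  // a plain read/write with no nested references
  for (const std::string &Child : It->second)
    scan(Child, G, Out, Seen);
}
```

A definition that is reachable from several places, such as a write shared by a sequence and a variant, is therefore recorded only the first time it is encountered.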
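Once all used SchedRead and SchedWrite definitions have been collected, lines 278 and 283 of collectSchedRW sort these Record objects by name. The loops on lines 279 and 284 then build a CodeGenSchedRW object for each Record and store it in the SchedWrites or SchedReads container.
At this point, only CodeGenSchedRW objects created from WriteSequence definitions have IsSequence set to true. A WriteSequence refers to a group of SchedWrite definitions repeated a specified number of times, and we need to know the CodeGenSchedRW objects of those SchedWrite definitions. The CodeGenSchedModels::findRWs method records the container indices of those objects in the Sequence container of the WriteSequence's own CodeGenSchedRW object.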
392 void CodeGenSchedModels::findRWs(const RecVec &RWDefs, IdxVec &RWs,
393                                  bool IsRead) const {
394   for (RecIter RI = RWDefs.begin(), RE = RWDefs.end(); RI != RE; ++RI) {
395     unsigned Idx = getSchedRWIdx(*RI, IsRead);
396     assert(Idx && "failed to collect SchedReadWrite");
397     RWs.push_back(Idx);
398   }
399 }
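The After parameter of getSchedRWIdx defaults to 0 and gives the position in the SchedReads or SchedWrites container at which the search starts. The function returns the index of the CodeGenSchedRW object for Def in SchedReads or SchedWrites, or 0, the slot reserved for invalid entries, if no such object exists.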
340 unsigned CodeGenSchedModels::getSchedRWIdx(Record *Def, bool IsRead,
341                                            unsigned After) const {
342   const std::vector<CodeGenSchedRW> &RWVec = IsRead ? SchedReads : SchedWrites;
343   assert(After < RWVec.size() && "start position out of bounds");
344   for (std::vector<CodeGenSchedRW>::const_iterator I = RWVec.begin() + After,
345          E = RWVec.end(); I != E; ++I) {
346     if (I->TheDef == Def)
347       return I - RWVec.begin();
348   }
349   return 0;
350 }
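The combination of a reserved slot 0, sorted insertion, and a linear lookup that returns 0 for "not found" can be sketched as follows. RWTable and its members are hypothetical names, and strings stand in for Record pointers; this is an illustration of the indexing scheme, not the LLVM implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical table: slot 0 is a placeholder meaning "invalid / not found".
struct RWTable {
  std::vector<std::string> Names = std::vector<std::string>(1);

  // Sort the collected names first so that the resulting indices are
  // deterministic, then append them after the reserved slot.
  void addAll(std::vector<std::string> Defs) {
    std::sort(Defs.begin(), Defs.end());
    for (const std::string &D : Defs)
      Names.push_back(D);
  }

  // Linear search starting at After; returns 0 when Name is absent.
  unsigned indexOf(const std::string &Name, unsigned After = 0) const {
    assert(After < Names.size() && "start position out of bounds");
    for (unsigned I = After, E = (unsigned)Names.size(); I != E; ++I)
      if (Names[I] == Name)
        return I;
    return 0;
  }
};
```

Because 0 never names a real entry, callers can test the returned index directly, which is how the assertions on lines 280 and 396 use getSchedRWIdx.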
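SchedAlias declares AliasRW as an alias of MatchRW. In the loop on line 297, the CodeGenSchedRW object of AliasRW is marked with IsAlias, and the SchedAlias record is appended to the Aliases container of MatchRW's CodeGenSchedRW object. If MatchRW is itself an alias, the fatal error on line 303 is reported.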
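The alias wiring and its "no alias of an alias" rule can be sketched in isolation. Everything here is hypothetical: RWInfo and wireAliases are illustrative names, strings replace records, the alias target's name is stored instead of the SchedAlias record, and an exception stands in for PrintFatalError.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

struct RWInfo {
  bool IsAlias = false;
  std::vector<std::string> Aliases;  // aliases attached to this read/write
};

// Each pair is (MatchRW, AliasRW). Throws if the matched entry is itself an
// alias. The pairs are assumed to arrive in a fixed, sorted order.
void wireAliases(const std::vector<std::pair<std::string, std::string>> &Defs,
                 std::map<std::string, RWInfo> &Table) {
  for (const auto &D : Defs) {
    Table[D.second].IsAlias = true;  // mark the alias target
    RWInfo &Match = Table[D.first];
    if (Match.IsAlias)
      throw std::runtime_error("Cannot Alias an Alias");
    Match.Aliases.push_back(D.second);
  }
}
```

In this sketch, aliasing the hypothetical MyWriteLoad to WriteLoad succeeds, while a later attempt to alias something to MyWriteLoad is rejected.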