Choosing between Competing Experimental Designs

Slides:



Advertisements
Παρόμοιες παρουσιάσεις
Προβλέψεις με τη χρήση προτύπων γραμμικής παλινδρόμησης και συσχέτισης
Advertisements

1 Please include the following information on this slide: Παρακαλώ, συμπεριλάβετε τις παρακάτω πληροφoρίες στη διαφάνεια: Name Giannakodimou Aliki Kourkouta.
ΗΥ Παπαευσταθίου Γιάννης1 Clock generation.
6/26/2015HY220: Ιάκωβος Μαυροειδής1 HY220 Asynchronous Circuits.
Week 11 Quiz Sentence #2. The sentence. λαλο ῦ μεν ε ἰ δότες ὅ τι ὁ ἐ γείρας τ ὸ ν κύριον Ἰ ησο ῦ ν κα ὶ ἡ μ ᾶ ς σ ὺ ν Ἰ ησο ῦ ἐ γερε ῖ κα ὶ παραστήσει.
WRITING B LYCEUM Teacher Eleni Rossidou ©Υπουργείο Παιδείας και Πολιτισμού.
Πολυώνυμα και Σειρές Taylor 1. Motivation Why do we use approximations? –They are made up of the simplest functions – polynomials. –We can differentiate.
Lesson 6c: Around the City I JSIS E 111: Elementary Modern Greek Sample of modern Greek alphabet, M. Adiputra,
Medilab.pme.duth.gr Δρ. Π. Ν. Μπότσαρης 1 ΔΗΜΟΚΡΙΤΕΙΟ ΠΑΝΕΠΙΣΤΗΜΕΙΟ ΘΡΑΚΗΣ ΠΟΛΥΤΕΧΝΙΚΗ ΣΧΟΛΗ ΤΜΗΜΑ ΜΗΧΑΝΙΚΩΝ ΠΑΡΑΓΩΓΗΣ κ ΔΙΟΙΚΗΣΗΣ ΤΟΜΕΑΣ ΥΛΙΚΩΝ, ΔΙΕΡΓΑΣΙΩΝ.
ΠΑΝΕΠΙΣΤΗΜΙΟ ΙΩΑΝΝΙΝΩΝ ΑΝΟΙΚΤΑ ΑΚΑΔΗΜΑΪΚΑ ΜΑΘΗΜΑΤΑ Εισαγωγή στην Ανθρωπολογία της Τέχνης Η έννοια της Τέχνης (What is art 1) Διδάσκων: Καθηγητής Χρήστος.
 Το παρόν εκπαιδευτικό υλικό υπόκειται σε άδειες χρήσης Creative Commons.  Για εκπαιδευτικό υλικό, όπως εικόνες, που υπόκειται σε άλλου τύπου άδειας.
Προσομοίωση Δικτύων 3η Άσκηση Δημιουργία, διαμόρφωση μελέτη σύνθετων τοπολογιών.
Time Management Matrix Assignment Submitted By Safwan Zubair October 21, 2013 BUS Contemporary Business Practice Professor Nankin.
Αριθμητική Επίλυση Διαφορικών Εξισώσεων 1. Συνήθης Δ.Ε. 1 ανεξάρτητη μεταβλητή x 1 εξαρτημένη μεταβλητή y Καθώς και παράγωγοι της y μέχρι n τάξης, στη.
ΕΥΡΩΠΑΪΚΑ ΣΧΟΛΕΙΑ. SCHOOLS OF EUROPEAN EDUCATION.
1 ΔΗΜΟΠΑΘΟΛΟΓΙΑ ΤΗΣ ΔΙΑΤΡΟΦΗΣ ΠΑΡΑΔΟΣΗ 1Οη (Θ) Στοιχεία Επαγωγικής Στατιστικής.
Ψηφιακά Παιχνίδια και μάθηση Δρ. Νικολέτα Γιαννούτσου Εργαστήριο Εκπαιδευτικής Τεχνολογίας.
Διαχείριση Διαδικτυακής Φήμης! Do the Online Reputation Check! «Ημέρα Ασφαλούς Διαδικτύου 2015» Ε. Κοντοπίδη, ΠΕ19.
Introduction to Latent Variable Models. A comparison of models X1X1 X2X2 X3X3 Y1Y1 δ1δ1 δ2δ2 δ3δ3 Model AModel B ξ1ξ1 X1X1 X2X2 X3X3 δ1δ1 δ2δ2 δ3δ3.
From Applying Theory to Theorising Practice Achilleas Kostoulas Epirus Institute of Technology.
Guide to Business Planning The Value Chain © Guide to Business Planning A principal use of value chain analysis is to identify a strategy mismatch between.
Guide to Business Planning The Value System © Guide to Business Planning The “value system” is also referred to as the “industry value chain”. In contrast.
Μαθαίνω με “υπότιτλους”
Αντισταθμιστική ανάλυση
Relations Chapter 9.
Αντικειμενοστραφής Προγραμματισμός ΙΙ
Ερωτήσεις –απαντήσεις Ομάδων Εργασίας
Φάσμα παιδαγωγικής ανάπτυξης
JSIS E 111: Elementary Modern Greek
Matrix Analytic Techniques
Ψηφιακeς ιδEες και αξIες
Differential sensitivity (difference threshold)
ΠΑΝΕΠΙΣΤΗΜΙΟ ΙΩΑΝΝΙΝΩΝ ΑΝΟΙΚΤΑ ΑΚΑΔΗΜΑΪΚΑ ΜΑΘΗΜΑΤΑ
Υδρόβια Φυτά Θεοφανώ Κούλεντρου Rippling Water (Basic)
φίλτρα IIR (Infinite Impulse Response)
Return and Risk in a Portfolio Setup
ΠΑΝΕΠΙΣΤΗΜΙΟ ΙΩΑΝΝΙΝΩΝ ΑΝΟΙΚΤΑ ΑΚΑΔΗΜΑΪΚΑ ΜΑΘΗΜΑΤΑ
ΠΑΝΕΠΙΣΤΗΜΙΟ ΙΩΑΝΝΙΝΩΝ ΑΝΟΙΚΤΑ ΑΚΑΔΗΜΑΪΚΑ ΜΑΘΗΜΑΤΑ
ΠΑΝΕΠΙΣΤΗΜΙΟ ΙΩΑΝΝΙΝΩΝ ΑΝΟΙΚΤΑ ΑΚΑΔΗΜΑΪΚΑ ΜΑΘΗΜΑΤΑ
Example Rotary Motion Problems
2013 edition Wilfred E. Major
Μία πρακτική εισαγωγή στην χρήση του R
Postgraduate Courses related to Clinical Criminology and Legal Psychology - Italy WE CAN – ΜΠΟΡΟΥΜΕ! Cyberbullying – Κυβερνοεκφοβισμός Δίκτυο Δράσης για.
ΥΠΟΥΡΓΕΙΟ ΠΑΙΔΕΙΑΣ ΚΑΙ ΠΟΛΙΤΙΣΜΟΥ
Solving Trig Equations
Find: φ σ3 = 400 [lb/ft2] CD test Δσ = 1,000 [lb/ft2] Sand 34˚ 36˚ 38˚
aka Mathematical Models and Applications
GLY 326 Structural Geology
Find: angle of failure, α
ΕΝΣΤΑΣΕΙΣ ΠΟΙΟΣ? Όμως ναι.... Ένα σκάφος
Find: minimum B [ft] γcon=150 [lb/ft3] γT=120 [lb/ft3] Q φ=36˚
Choosing between Competing Experimental Designs
νλμ : The Computational Content of Classical Natural Deduction
Find: ρc [in] from load γT=110 [lb/ft3] γT=100 [lb/ft3]
Find: ρc [in] from load γT=106 [lb/ft3] γT=112 [lb/ft3]
ΑΝΟΡΓΑΝΗ & ΑΝΑΛΥΤΙΚΗ ΧΗΜΕΙΑ
Find: σ1 [kPa] for CD test at failure
Βάλια Τόλιου, Registry Manager for Greece
τ [lb/ft2] σ [lb/ft2] Find: c in [lb/ft2] σ1 = 2,000 [lb/ft2]
Financial Market Theory
ΜΕΤΑΦΡΑΣΗ ‘ABC of Selling’. ΤΟ ΑΛΦΑΒΗΤΑΡΙ ΤΩΝ ΠΩΛΗΣΕΩΝ
Find: Force on culvert in [lb/ft]
Λέξεις που αλλάζουν νόημα αν είναι μετρήσιμα ή μη μετρήσιμα
Deriving the equations of
Variable-wise and Term-wise Recentering
David Evans University of Queensland
Find: ρc [in] from load (4 layers)
Erasmus + An experience with and for refugees Fay Pliagou.
Μεταγράφημα παρουσίασης:

Choosing between Competing Experimental Designs 2/7/2011 Copyright © 2011 Dan Nettleton

Design 1 for Comparing Two Treatments 2 1 2 1 2 1 2

Design 2 for Comparing Two Treatments 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2

Design 1: Mixed Linear Model for the Observations from a Single Gene Y1111=μ+τ1+δ1+s1+b11+e1111 Y1221=μ+τ1+δ2+s2+b11+e1221 Y2211=μ+τ2+δ2+s1+b21+e2211 Y2121=μ+τ2+δ1+s2+b21+e2121 Y1132=μ+τ1+δ1+s3+b12+e1132 Y1242=μ+τ1+δ2+s4+b12+e1242 Y2232=μ+τ2+δ2+s3+b22+e2232 Y2142=μ+τ2+δ1+s4+b22+e2142 Y1153=μ+τ1+δ1+s5+b13+e1153 Y1263=μ+τ1+δ2+s6+b13+e1263 Y2253=μ+τ2+δ2+s5+b23+e2253 Y2163=μ+τ2+δ1+s6+b23+e2163 Y1174=μ+τ1+δ1+s7+b14+e1174 Y1284=μ+τ1+δ2+s8+b14+e1284 Y2274=μ+τ2+δ2+s7+b24+e2274 Y2184=μ+τ2+δ1+s8+b24+e2184 Yijkl=μ+τi+δj+sk+bil+eijkl

Design 2: Mixed Linear Model for the Observations from a Single Gene Y1111=μ+τ1+δ1+s1+b11+e1111 Y1222=μ+τ1+δ2+s2+b12+e1222 Y2211=μ+τ2+δ2+s1+b21+e2211 Y2122=μ+τ2+δ1+s2+b22+e2122 Y1133=μ+τ1+δ1+s3+b13+e1133 Y1244=μ+τ1+δ2+s4+b14+e1244 Y2233=μ+τ2+δ2+s3+b23+e2233 Y2144=μ+τ2+δ1+s4+b24+e2144 Y1155=μ+τ1+δ1+s5+b15+e1155 Y1266=μ+τ1+δ2+s6+b16+e1266 Y2255=μ+τ2+δ2+s5+b25+e2255 Y2166=μ+τ2+δ1+s6+b26+e2166 Y1177=μ+τ1+δ1+s7+b17+e1177 Y1288=μ+τ1+δ2+s8+b18+e1288 Y2277=μ+τ2+δ2+s7+b27+e2277 Y2188=μ+τ2+δ1+s8+b28+e2188 Yijkl=μ+τi+δj+sk+bil+eijkl

Design 2: Mixed Linear Model for the Observations from a Single Gene Y1111=μ+τ1+δ1+s1+b11+e1111 Y1222=μ+τ1+δ2+s2+b12+e1222 Y2211=μ+τ2+δ2+s1+b21+e2211 Y2122=μ+τ2+δ1+s2+b22+e2122 Y1133=μ+τ1+δ1+s3+b13+e1133 Y1244=μ+τ1+δ2+s4+b14+e1244 Y2233=μ+τ2+δ2+s3+b23+e2233 Y2144=μ+τ2+δ1+s4+b24+e2144 Y1155=μ+τ1+δ1+s5+b15+e1155 Y1266=μ+τ1+δ2+s6+b16+e1266 Y2255=μ+τ2+δ2+s5+b25+e2255 Y2166=μ+τ2+δ1+s6+b26+e2166 Y1177=μ+τ1+δ1+s7+b17+e1177 Y1288=μ+τ1+δ2+s8+b18+e1288 Y2277=μ+τ2+δ2+s7+b27+e2277 Y2188=μ+τ2+δ1+s8+b28+e2188 Note that b and e are completely confounded in Design 2. Thus we would use only one random residual term for both factors, but we write the terms separately here for the sake of comparison with Design 1.

Test of Interest H0 : τ1 = τ2 vs. HA : τ1 = τ2 Equivalent to H0 : τ1 - τ2 = 0 vs. HA : τ1 - τ2 = 0 We estimate τ1 - τ2 by Y1...-Y2... Y1...-Y2...=τ1 - τ2 + b1.- b2. + e1... - e2...

Design 1 Estimate of τ1-τ2 Y1111=μ+τ1+δ1+s1+b11+e1111 Y1221=μ+τ1+δ2+s2+b11+e1221 Average over 4 effects Y2211=μ+τ2+δ2+s1+b21+e2211 Y2121=μ+τ2+δ1+s2+b21+e2121 Y1132=μ+τ1+δ1+s3+b12+e1132 Y1242=μ+τ1+δ2+s4+b12+e1242 Y2232=μ+τ2+δ2+s3+b22+e2232 Y2142=μ+τ2+δ1+s4+b22+e2142 Y1153=μ+τ1+δ1+s5+b13+e1153 Y1263=μ+τ1+δ2+s6+b13+e1263 Y2253=μ+τ2+δ2+s5+b23+e2253 Y2163=μ+τ2+δ1+s6+b23+e2163 Y1174=μ+τ1+δ1+s7+b14+e1174 Y1284=μ+τ1+δ2+s8+b14+e1284 Y2274=μ+τ2+δ2+s7+b24+e2274 Y2184=μ+τ2+δ1+s8+b24+e2184 Y1...=μ+τ1+δ.+s.+b1.+e1... Y2...=μ+τ2+δ.+s.+b2.+e2... Y1... - Y2... = τ1 - τ2 + b1.- b2. + e1... - e2...

Design 2 Estimate of τ1-τ2 Y1111=μ+τ1+δ1+s1+b11+e1111 Y1222=μ+τ1+δ2+s2+b12+e1222 Average over 8 effects Y2211=μ+τ2+δ2+s1+b21+e2211 Y2122=μ+τ2+δ1+s2+b22+e2122 Y1133=μ+τ1+δ1+s3+b13+e1133 Y1244=μ+τ1+δ2+s4+b14+e1244 Y2233=μ+τ2+δ2+s3+b23+e2233 Y2144=μ+τ2+δ1+s4+b24+e2144 Y1155=μ+τ1+δ1+s5+b15+e1155 Y1266=μ+τ1+δ2+s6+b16+e1266 Y2255=μ+τ2+δ2+s5+b25+e2255 Y2166=μ+τ2+δ1+s6+b26+e2166 Y1177=μ+τ1+δ1+s7+b17+e1177 Y1288=μ+τ1+δ2+s8+b18+e1288 Y2277=μ+τ2+δ2+s7+b27+e2277 Y2188=μ+τ2+δ1+s8+b28+e2188 Y1...=μ+τ1+δ.+s.+b1.+e1... Y2...=μ+τ2+δ.+s.+b2.+e2... Y1... - Y2... = τ1 - τ2 + b1.- b2. + e1... - e2...

Variance of the Estimated Difference Y1... - Y2... = τ1 - τ2 + b1.- b2. + e1... - e2... Var(Y1.. - Y2..) = Var(b1.- b2. + e1... - e2...) For Design 1: σb + σb + σe + σe = 2σb+σe For Design 2: σb + σb + σe + σe = σb+σe 2 2 2 2 2 2 4 8 4 Design 1 variance is never smaller than Design 2 variance. 2 2 2 2 2 2 8 4

Design 2 is Preferred over Design 1 The variance of the estimated treatment difference for Design 2 will always be lower than or equal to the variance for Design 1. The standard errors for Design 2 will tend to smaller than for Design 1. The t-statistics for Design 2 will tend to be more extreme than the t-statistics for Design 1 when genes are truly differentially expressed. The p-values for differentially expressed genes will tend be smaller with Design 2 than with Design 1. Design 2 has more power for detecting differential expression than Design 1.

Some General Microarray Experimental Design Advice Use as much biological replication as is affordable. If the number of microarray slides or GeneChips is the limiting factor, measure each sample only once. Measuring any one sample more than once reduces the degree of biological replication that is possible, and this reduces the power to detect differential expression. If the number of biological replications is the limiting factor, measuring each experimental unit multiple times can improve precision, but this technical replication is no substitute for biological replication.

For example, Design A > Design B > Design C 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2

An Analysis Based on Red – Green Differences Y1111=μ+τ1+δ1+s1+b11+e1111 Y1222=μ+τ1+δ2+s2+b12+e1222 Y2211=μ+τ2+δ2+s1+b21+e2211 Y2122=μ+τ2+δ1+s2+b22+e2122 d1 d2 Y1133=μ+τ1+δ1+s3+b13+e1133 Y1244=μ+τ1+δ2+s4+b14+e1244 Y2233=μ+τ2+δ2+s3+b23+e2233 Y2144=μ+τ2+δ1+s4+b24+e2144 d3 d4 Y1155=μ+τ1+δ1+s5+b15+e1155 Y1266=μ+τ1+δ2+s6+b16+e1266 Y2255=μ+τ2+δ2+s5+b25+e2255 Y2166=μ+τ2+δ1+s6+b26+e2166 d5 d6 Y1177=μ+τ1+δ1+s7+b17+e1177 Y1288=μ+τ1+δ2+s8+b18+e1288 Y2277=μ+τ2+δ2+s7+b27+e2277 Y2188=μ+τ2+δ1+s8+b28+e2188 d7 d8

An Analysis Based on Red – Green Differences Y1111=μ+τ1+δ1+s1+b11+e1111 Y1222=μ+τ1+δ2+s2+b12+e1222 Y2211=μ+τ2+δ2+s1+b21+e2211 Y2122=μ+τ2+δ1+s2+b22+e2122 d1 d2 Y2211-Y1111=(μ+τ2+δ2+s1+b21+e2211)-(μ+τ1+δ1+s1+b11+e1111) Y2211-Y1111=δ2-δ1+τ2-τ1+b21-b11+e2211-e1111 d1 = μ1 + r1 Y1222-Y2122=δ2-δ1+τ1-τ2+b12-b22+e1222-e2122 d2 = μ2 + r2

Test for Differential Expression Using a 2-Sample t-Test d1 = μ1 + r1 d2 = μ2 + r2 d3 = μ1 + r3 d4 = μ2 + r4 d5 = μ1 + r5 d6 = μ2 + r6 d7 = μ1 + r7 d8 = μ2 + r8 μ1= δ2-δ1+τ2-τ1 μ2= δ2-δ1+τ1-τ2 μ1 = μ2 δ2-δ1+τ2-τ1 = δ2-δ1+τ1-τ2 τ2-τ1 = τ1-τ2 τ1 = τ2 A 2-sample t-test of H0:μ1=μ2 using d1,d3,d5,d7 as one sample and d2,d4,d6,d8 as the other sample is a test of H0: τ1=τ2.

An Analysis Based on Red – Green Differences Y1111=μ+τ1+δ1+s1+b11+e1111 Y1222=μ+τ1+δ2+s2+b12+e1222 Y2211=μ+τ2+δ2+s1+b21+e2211 Y2122=μ+τ2+δ1+s2+b22+e2122 d1 d2 Y2211-Y1111=(μ+τ2+δ2+s1+b21+e2211)-(μ+τ1+δ1+s1+b11+e1111) Y2211-Y1111=δ2-δ1+τ2-τ1+b21-b11+e2211-e1111 Y2211-Y1111=δ2-δ1+(τ1-τ2)(-1)+b21-b11+e2211-e1111 Note that test of interest becomes H0 : β1 = 0 vs. HA : β1 = 0. d1 = β0 + β1 * x1 + r1 Y1222-Y2122=δ2-δ1+(τ1-τ2)(1)+b12-b22+e1222-e2122 d2 = β0 + β1 * x2 + r2

Test for Differential Expression Using Simple Linear Regression d1 = β0 + β1 x1 + r1 d2 = β0 + β1 x2 + r2 d3 = β0 + β1 x3 + r3 d4 = β0 + β1 x4 + r4 d5 = β0 + β1 x5 + r5 d6 = β0 + β1 x6 + r6 d7 = β0 + β1 x7 + r7 d8 = β0 + β1 x8 + r8

Test for Differential Expression Using Simple Linear Regression d1 = β0 + β1(-1) + r1 d2 = β0 + β1(1) + r2 d3 = β0 + β1(-1) + r3 d4 = β0 + β1(1) + r4 d5 = β0 + β1(-1) + r5 d6 = β0 + β1(1) + r6 d7 = β0 + β1(-1) + r7 d8 = β0 + β1(1) + r8

Same Equations in Matrix Form d1 1 -1 β0 d2 1 1 β1 d3 1 -1 d4 = 1 1 d5 1 -1 d6 1 1 d7 1 -1 d8 1 1 r1 r2 r3 r4 r5 r6 r7 r8 + d = Xβ + r

Results from Multiple Regression Note that the elements of r are independent and normally distributed with variance σ = 2σb + 2σe according the our original linear model. Thus The best linear unbiased estimator of β is β =(X’X)-1 X’d, β is normally distributed with mean β and var σ2(X’X)-1, σ2 = (d-Xβ)’(d-Xβ)/(n-p) is an unbiased estimator of σ2 where n=length of d and p=length of β, and t=(g’β - g’β)/(σ2g’(X’X)-1g)0.5 ~ t with n-p d.f. 2 2 2 ^ ^ ^ ^ ^ ^ ^

Results from Multiple Regression for Our Example 8 0 0 8 1/8 0 0 1/8 (X’X)= (X’X)-1 = d1+d2+d3+d4+d5+d6+d7+d8 -d1+d2-d3+d4-d5+d6-d7+d8 Y.2..-Y.1.. Y1...-Y2... X’d = = Y.2..-Y.1.. Y1...-Y2... ^ β = (X’X)-1 X’d = σb+σe 4 2 2 ^ 1/8 0 0 1/8 1/8 0 0 1/8 2 2 Var(β) = σ2 =(2σb+2σe) 2 2 =

A 2 x 2 Factorial Experiment Suppose we wish to study gene expression in chickens in response to Salmonella infection. We have only 6 two-color microarray slides and 12 chickens to work with. We wish to consider two factors: - infection type : mock or Salmonella - breed : resistant (r) or susceptible (s)

There are 4 Possible Treatments Treatments Alternative ways to code their means mock r μ1 μ+α1+β1+(αβ)11 μ+α+β+θ mock s μ2 μ+α1+β2+(αβ)12 μ+β Salmonella r μ3 μ+α2+β1+(αβ)21 μ+α Salmonella s μ4 μ+α2+β2+(αβ)22 μ

Normalized Log Expression Tests of Interest μ+α 10 Is mean expression under Salmonella infection different between the resistant and susceptible breeds? resistant μ+α+β+θ Normalized Log Expression 5 susceptible μ μ+β mock Salmonella Infection

Normalized Log Expression Is the difference that we see between the breeds the same under Salmonella infection as it is under mock infection? μ+α 10 resistant μ+α+β+θ Normalized Log Expression 5 ?=? susceptible μ μ+β mock Salmonella Infection

Normalized Log Expression Does the difference between mock and Salmonella infected chickens depend on the breed? μ+α 10 resistant μ+α+β+θ Normalized Log Expression 5 ?=? susceptible μ μ+β mock Salmonella Infection

Normalized Log Expression Write the null hypothesis for each test of interest in terms of the treatment mean parameters. μ+α 10 resistant μ+α+β+θ Normalized Log Expression 5 susceptible μ μ+β mock Salmonella Infection

How would you assign treatments to chickens and pair chickens on slides to best answer our questions of interest? Recall that we have 12 chickens, 6 slides, and 4 treatments.

A Comparison of Competing Designs μ+α+β+θ μ+α μ μ+β

A Comparison of Competing Designs μ+α+β+θ μ+α μ μ+β

Red – Green Differences for Design 1 μ+α+β+θ 1 μ+α Slide Mean Difference 1 δ2-δ1-β-θ 2 δ2-δ1-α 3 δ2-δ1+β 4 δ2-δ1+α+θ 5 δ2-δ1-α-β-θ 6 δ2-δ1-α+β 5 6 2 4 μ μ+β 3

X Matrix for Design 1 X β Slide Mean Difference 1 δ2-δ1-β-θ 2 δ2-δ1-α 1 δ2-δ1-β-θ 2 δ2-δ1-α 3 δ2-δ1+β 4 δ2-δ1+α+θ 5 δ2-δ1-α-β-θ 6 δ2-δ1-α+β 0 -1 -1 1 -1 0 0 1 0 1 0 1 1 0 1 1 -1 -1 -1 1 -1 1 0 δ2-δ1 α β θ

Red – Green Differences for Design 2 Slide Mean Difference 1 δ2-δ1-β-θ 2 δ2-δ1-α 3 δ2-δ1+β 4 δ2-δ1+α+θ 5 δ2-δ1-α-θ 6 δ2-δ1+α μ+α+β+θ 1 μ+α 2 4 5 6 μ μ+β 3

X Matrix for Design 2 X β Slide Mean Difference 1 δ2-δ1-β-θ 2 δ2-δ1-α 1 δ2-δ1-β-θ 2 δ2-δ1-α 3 δ2-δ1+β 4 δ2-δ1+α+θ 5 δ2-δ1-α-θ 6 δ2-δ1+α 1 0 -1 -1 1 -1 0 0 1 0 1 0 1 1 0 1 1 -1 0 -1 1 1 0 0 δ2-δ1 α β θ

Comparing Variances of the Competing Designs δ2-δ1 α β θ Recall that our tests of interest are H0 : α = 0 and H0 : θ = 0. The variance of our estimator of β is given by σ2(X’X)-1. β = (X’X)-1 for Design 1 (X’X)-1 for Design 2 0.1875 -0.0625 -0.0625 0.125 -0.0625 0.4375 0.1875 -0.375 -0.0625 0.1875 0.6875 -0.375 0.1250 -0.3750 -0.3750 0.750 0.2 0.10 0.00 0.0 0.1 0.55 0.25 -0.5 0.0 0.25 0.50 -0.5 0.0 -0.50 -0.50 1.0 Design 2 has a lower variance for the estimator of α.

Comparing Variances of the Competing Designs δ2-δ1 α β θ Recall that our tests of interest are H0 : α = 0 and H0 : θ = 0. The variance of our estimator of β is given by σ2(X’X)-1. β = (X’X)-1 for Design 1 (X’X)-1 for Design 2 0.1875 -0.0625 -0.0625 0.125 -0.0625 0.4375 0.1875 -0.375 -0.0625 0.1875 0.6875 -0.375 0.1250 -0.3750 -0.3750 0.750 0.2 0.10 0.00 0.0 0.1 0.55 0.25 -0.5 0.0 0.25 0.50 -0.5 0.0 -0.50 -0.50 1.0 Design 2 has a lower variance for the estimator of θ.

Dominance Design 2 is said to dominate Design 1 with respect to the tests of interest because the variances of the parameter estimators are lower for each parameter of interest when using Design 2 as compared to Design 1. Let vik denote the variance of the estimator of the kth parameter of interest using Design i. Design 2 is said to dominate Design 1 if v2k ≤ v1k for all k of interest and v2k<v1k for at least one k of interest.

Admissibility A design is said to be admissible within a class of designs if there is no design in the class that dominates it. A design that is dominated by another design in its class is said to inadmissible. In our example, Design 1 is inadmissible among the class of designs that use 12 chickens and 6 slides because it is dominated by Design 2. Design 2 is also inadmissible in the class of designs that use 12 chickens and 6 slides. Can you find a design that dominates it?

Alternative Designs 3 4 5 6 μ+α+β+θ μ+α μ+α+β+θ μ+α μ+β μ μ+β μ

> x1=matrix(c( + 1,0,-1,-1, + 1,-1,0,0, + 1,0,1,0, + 1,1,0,1, + 1,-1,-1,-1, + 1,-1,1,0), + byrow=T,nrow=6) > x2=x1 > x2[5,]=c(1,-1,0,-1) > x2[6,]=c(1,1,0,0)

> x3=matrix(c( + 1,1,0,1, + 1,-1,0,-1, + 1,-1,0,0, + 1,1,0,0, + 1,-1,0,0), + byrow=T,nrow=6) > x4=x3 > x4[,4]=c(1,-1,0,0,0,0) > x5=x4 > x5[2,4]=0 > x6=x5 > x6[,4]=c(1,-1,1,-1,1,0)

> x1 [,1] [,2] [,3] [,4] [1,] 1 0 -1 -1 [2,] 1 -1 0 0 [3,] 1 0 1 0 [4,] 1 1 0 1 [5,] 1 -1 -1 -1 [6,] 1 -1 1 0 > x2 [5,] 1 -1 0 -1 [6,] 1 1 0 0

> x3 [,1] [,2] [,3] [,4] [1,] 1 1 0 1 [2,] 1 -1 0 -1 [3,] 1 1 0 1 [4,] 1 -1 0 0 [5,] 1 1 0 0 [6,] 1 -1 0 0 > x4 [3,] 1 1 0 0

> x5 [,1] [,2] [,3] [,4] [1,] 1 1 0 1 [2,] 1 -1 0 0 [3,] 1 1 0 0 [4,] 1 -1 0 0 [5,] 1 1 0 0 [6,] 1 -1 0 0 > x6 [2,] 1 -1 0 -1 [3,] 1 1 0 1 [4,] 1 -1 0 -1 [5,] 1 1 0 1

Alternative Designs 3 4 5 6 μ+α+β+θ μ+α μ+α+β+θ μ+α μ+β μ μ+β μ

> solve(t(x1)%*%x1) [,1] [,2] [,3] [,4] [1,] 0.2 0.10 0.00 0.0 [2,] 0.1 0.55 0.25 -0.5 [3,] 0.0 0.25 0.50 -0.5 [4,] 0.0 -0.50 -0.50 1.0 > solve(t(x2)%*%x2) [,1] [,2] [,3] [,4] [1,] 0.1875 -0.0625 -0.0625 0.125 [2,] -0.0625 0.4375 0.1875 -0.375 [3,] -0.0625 0.1875 0.6875 -0.375 [4,] 0.1250 -0.3750 -0.3750 0.750 > solve(t(x3[,-3])%*%x3[,-3]) [,1] [,2] [,3] [1,] 0.1875 0.0625000 -0.125 [2,] 0.0625 0.3541667 -0.375 [3,] -0.1250 -0.3750000 0.750

Alternative Designs 3 4 5 6 μ+α+β+θ μ+α μ+α+β+θ μ+α μ+β μ μ+β μ

> solve(t(x3[,-3])%*%x3[,-3]) [,1] [,2] [,3] [1,] 0.1875 0.0625000 -0.125 [2,] 0.0625 0.3541667 -0.375 [3,] -0.1250 -0.3750000 0.750 > solve(t(x4[,-3])%*%x4[,-3]) [,1] [,2] [,3] [1,] 0.1666667 0.00 0.00 [2,] 0.0000000 0.25 -0.25 [3,] 0.0000000 -0.25 0.75

Alternative Designs 3 4 5 6 μ+α+β+θ μ+α μ+α+β+θ μ+α μ+β μ μ+β μ

> solve(t(x4[,-3])%*%x4[,-3]) [,1] [,2] [,3] [1,] 0.1666667 0.00 0.00 [2,] 0.0000000 0.25 -0.25 [3,] 0.0000000 -0.25 0.75 > solve(t(x5[,-3])%*%x5[,-3]) [,1] [,2] [,3] [1,] 0.20833333 0.04166667 -0.25 [2,] 0.04166667 0.20833333 -0.25 [3,] -0.25000000 -0.25000000 1.50 > solve(t(x6[,-3])%*%x6[,-3]) [,1] [,2] [,3] [1,] 0.2083333 0.2083333 -0.25 [2,] 0.2083333 1.2083333 -1.25 [3,] -0.2500000 -1.2500000 1.50

The concept of admissibility for two-color microarray experiments was introduced by Glonek, G. F. V. and Solomon, P. J. (2004). Factorial and time course designs for cDNA microarray experiments. Biostatistics, 5, 89-111.

Which of Design 1 or Design 2 would be better if our primary goal was to... test for a dye effect? test for a difference between mock and Salmonella infection for the susceptible breed? test for a difference between the resistant and susceptible breeds under mock infection? test for infection type main effects? test for breed main effects?

Normalized Log Expression μ+α 10 resistant μ+α+β+θ Normalized Log Expression 5 susceptible μ μ+β mock Salmonella Infection

Questions Restated in Terms of Model Parameters δ2-δ1=0? (1 0 0 0) β = 0 ? β=0? (0 0 1 0) β = 0 ? α+θ=0? (0 1 0 1) β = 0 ? β+θ/2=0? (0 0 1 .5) β = 0 ? α+θ/2=0? (0 1 0 .5) β = 0 ? δ2-δ1 α β θ β =

> g1=c(1,0,0,0) > t(g1)%*%solve(t(x1)%*%x1)%*%g1 [,1] [1,] 0.2 > t(g1)%*%solve(t(x2)%*%x2)%*%g1 [1,] 0.1875 > g2=c(0,0,1,0) > t(g2)%*%solve(t(x1)%*%x1)%*%g2 [1,] 0.5 > t(g2)%*%solve(t(x2)%*%x2)%*%g2 [1,] 0.6875

> g3=c(0,1,0,1) > t(g3)%*%solve(t(x1)%*%x1)%*%g3 [,1] [1,] 0.55 > t(g3)%*%solve(t(x2)%*%x2)%*%g3 [1,] 0.4375 > g4=c(0,0,1,.5) > t(g4)%*%solve(t(x1)%*%x1)%*%g4 [1,] 0.25 > t(g4)%*%solve(t(x2)%*%x2)%*%g4 [1,] 0.5

> g5=c(0,1,0,.5) > t(g5)%*%solve(t(x1)%*%x1)%*%g5 [,1] [1,] 0.3 > t(g5)%*%solve(t(x2)%*%x2)%*%g5 [1,] 0.25

Choice of Parameterization is Not Important The definition of the four treatment means as μ+α+β+θ, μ+β, μ+α, and μ and the choice of β=(δ2-δ1, α, β, θ)’ were just convenient choices for the sake of illustration. Any other equivalent parameterization would lead to the same conclusions. It is not necessary to use an X matrix that has full column rank. Calculations could be based on comparisons of g’(X’X)-g, where (X’X)- is an generalized inverse of X’X.

2 x 3 Factorial Experiment Suppose factor A has levels A1 and A2. Suppose factor B has levels B1, B2, and B3. Suppose we have 12 experimental units and 6 two-color microarray slides. What design should we use if our primary objective is to test for interaction between factors A and B?

Means for the 2 x 3 Factorial Experiment μ+α+β2+θ2 A1 H0 : No Interaction μ+α+β1+θ1 H0 : α=α+θ1=α+θ2 μ+α Response H0 : θ1=θ2=0 μ+β1 μ A2 μ+β2 B1 B2 B3 Level of B

Means for the 2 x 3 Factorial Experiment μ+α+β2+θ2 A1 H0 : No Interaction μ+α+β1+θ1 H0 : β1=β1+θ1 and β2-β1=β2- β1+ θ2-θ1 β2= β2+θ2 μ+α Response μ+β1 μ H0 : θ1=θ2=0 A2 μ+β2 B1 B2 B3 Level of B

Multiple Regression Parameterization d = X β + r vector of residuals vector of red – green differences parameter vector design matrix δ2-δ1 α β1 β2 θ1 θ2 H0 : θ1=θ2=0 β = H0 : Mβ=0 where 0 0 0 0 1 0 0 0 0 0 0 1 M = =

^ Mβ. ^ Var(Mβ)=σ2M(X’X)-1M’ δ2-δ1 α H0 : θ1=θ2=0 β1 β2 β = θ1 θ2 H0 : Mβ=0 where 0 0 0 0 1 0 0 0 0 0 0 1 M = = The interaction parameters θ1 and θ2 are estimated by ^ Mβ. ^ Var(Mβ)=σ2M(X’X)-1M’

^ Var(Mβ)=σ2M(X’X)-1M’ Find design X so that the determinant of M(X’X)-1M’ is minimized.

Which design is preferred? μ+α+β1+θ1 μ+α+β2+θ2 μ+α 1 0 1 0 1 0 1 0 -1 1 -1 1 1 -1 0 0 0 -1 1 0 1 -1 0 0 1 0 -1 0 0 0 1 1 0 0 0 0 X1 = μ+β2 δ2-δ1 α β1 β2 θ1 θ2 μ μ+β1 β= 1 1 0 0 0 0 1 -1 0 0 0 0 1 1 0 0 1 0 1 -1 0 0 -1 0 1 1 0 0 0 1 1 -1 0 0 0 -1 μ+α μ+α+β1+θ1 μ+α+β2+θ2 X2 = μ μ+β1 μ+β2

> X1=matrix(c( + 1,1,1,1,1,1, + 0,0,-1,0,0,1, + 1,-1,0,1,-1,0, + 0,1,0,-1,0,0, + 1,-1,0,0,0,0, + 0,1,-1,0,0,0),nrow=6) > X2=matrix(c( + 1,-1,1,-1,1,-1, + 0,0,0,0,0,0, + 0,0,1,-1,0,0, + 0,0,0,0,1,-1),nrow=6)

> X1 [,1] [,2] [,3] [,4] [,5] [,6] [1,] 1 0 1 0 1 0 [2,] 1 0 -1 1 -1 1 [3,] 1 -1 0 0 0 -1 [4,] 1 0 1 -1 0 0 [5,] 1 0 -1 0 0 0 [6,] 1 1 0 0 0 0 > X2 [1,] 1 1 0 0 0 0 [2,] 1 -1 0 0 0 0 [3,] 1 1 0 0 1 0 [4,] 1 -1 0 0 -1 0 [5,] 1 1 0 0 0 1 [6,] 1 -1 0 0 0 -1

> M=matrix(c(0,0,0,0,0,0,0,0,1,0,0,1),nrow=2) [,1] [,2] [,3] [,4] [,5] [,6] [1,] 0 0 0 0 1 0 [2,] 0 0 0 0 0 1 > det(M%*%solve(t(X1)%*%X1)%*%t(M)) [1] 1.333333 > library(MASS) > det(M%*%ginv(t(X2)%*%X2)%*%t(M)) [1] 0.75

Three Treatment CRD with Loops 1 1 1 1 3 2 3 2 3 2 3 2 fixed Y = trt dye slide xu random

Three Treatment CRD with Loops 1 1 1 1 3 2 3 2 3 2 3 2 Yijkl =μ+τi+δj+sk+bl+eijkl

Consider Red – Green Differences for Each Slide Y2212 = μ+τ2+δ2+s1+b2+e2212 Y1111 = (μ+τ1+δ1+s1+b1+e1111) Y2212-Y1111 = δ2-δ1+τ2-τ1+b2-b1+e2212-e1111 - Y1131-Y3233 = δ2-δ1+τ1-τ3+b1-b3+e1131-e3233 1 1 1 1 Y3223-Y2222 = δ2-δ1+τ3-τ2+b3-b2+e3223-e2222 3 2 3 2 3 2 3 2

Variance of Red-Green Differences is Constant Var(Y2212-Y1111) = Var(δ2-δ1+τ2-τ1+b2-b1+e2212-e1111) =Var(b2-b1+e2212-e1111) =σb+σb+σe+σe 2 2 2 2 =2σb+2σe 2 2 This variance is the same for the difference from each slide.

Covariance between Red-Green Differences from the Same Loop Cov(Y2212-Y1111 , Y3223-Y2222 ) =Cov(b2-b1+e2212-e1111 , b3-b2+e3223-e2222) =Cov(b2 , -b2) = -σb 2 This covariance is the same for all pairs of differences that come from the same loop. The covariance between differences that come from different loops is 0.

We can model the differences using a multiple linear regression model with correlated residual random effects. 1 1 1 1 d3 d1 d6 d4 d9 d7 d12 d10 3 2 3 2 3 2 3 2 d2 d5 d8 d11 d1 d2 d3 d4 d5 d6 d7 d8 d9 d10 d11 d12 1 1 0 1 0 1 1 -1 -1 1 -1 0 1 0 -1 1 1 1 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 δ2-δ1 τ2-τ1 τ3-τ2 d = Xβ + r = +

Var(d) = Var(r) = V where A B B 0 0 0 0 0 0 0 0 0 B A B 0 0 0 0 0 0 0 0 0 B B A 0 0 0 0 0 0 0 0 0 0 0 0 A B B 0 0 0 0 0 0 0 0 0 B A B 0 0 0 0 0 0 0 0 0 B B A 0 0 0 0 0 0 0 0 0 0 0 0 A B B 0 0 0 0 0 0 0 0 0 B A B 0 0 0 0 0 0 0 0 0 B B A 0 0 0 0 0 0 0 0 0 0 0 0 A B B 0 0 0 0 0 0 0 0 0 B A B 0 0 0 0 0 0 0 0 0 B B A A = 2σb+2σe 2 2 V = B = -σb 2

If V is known The best linear unbiased estimator of β is β =(X’V-1X)-1 X’V-1d. β is normally distributed with mean β and var (X’V-1X)-1 Designs could be compared by examining (X’V-1X)-1 for various choices of X. Note however that the assessment of a design might depend on the variance components in V. In reality these variance component parameters are unknown and must be estimated from the data. ^ ^